Message ID | 20240910140625.175700-1-wangkefeng.wang@huawei.com (mailing list archive) |
---|---|
State | New |
Series | mm: set hugepage to false when anon mthp allocation |
On 2024/9/10 22:06, Kefeng Wang wrote:
> When the hugepage parameter is true in vma_alloc_folio(), it indicates
> that we only try allocation on preferred node if possible for PMD_ORDER,

Should remove "for PMD_ORDER", I mean that it was used for PMD_ORDER,
but for other high-order, it will reduce the success rate of allocation
if without ddc1a5cbc05d.

> but it could lead to lots of failures for large folio allocation,
> luckily the hugepage parameter was deprecated since commit ddc1a5cbc05d
> ("mempolicy: alloc_pages_mpol() for NUMA policy without vma"), so no
> effect on runtime behavior.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>
> Found the issue when backport mthp to inner kernel without ddc1a5cbc05d,
> but for mainline, there is no issue, no clue why hugepage parameter was
> retained, maybe just kill the parameter for mainline?
>
>  mm/memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index b84443e689a8..89a15858348a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4479,7 +4479,7 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
>  	gfp = vma_thp_gfp_mask(vma);
>  	while (orders) {
>  		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
> -		folio = vma_alloc_folio(gfp, order, vma, addr, true);
> +		folio = vma_alloc_folio(gfp, order, vma, addr, false);
>  		if (folio) {
>  			if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
>  				count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
Hi All,

On 2024/9/10 22:18, Kefeng Wang wrote:
>
>
> On 2024/9/10 22:06, Kefeng Wang wrote:
>> When the hugepage parameter is true in vma_alloc_folio(), it indicates
>> that we only try allocation on preferred node if possible for PMD_ORDER,
>
> Should remove "for PMD_ORDER", I mean that it was used for PMD_ORDER,
> but for other high-order, it will reduce the success rate of allocation
> if without ddc1a5cbc05d.
>
>
>> but it could lead to lots of failures for large folio allocation,
>> luckily the hugepage parameter was deprecated since commit ddc1a5cbc05d
>> ("mempolicy: alloc_pages_mpol() for NUMA policy without vma"), so no
>> effect on runtime behavior.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>
>> Found the issue when backport mthp to inner kernel without ddc1a5cbc05d,
>> but for mainline, there is no issue, no clue why hugepage parameter was
>> retained, maybe just kill the parameter for mainline?

Any comments, fix in alloc_anon_folio() or remove hugepage parameter in
vma_alloc_folio(), thanks.

>>
>>  mm/memory.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index b84443e689a8..89a15858348a 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4479,7 +4479,7 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
>>  	gfp = vma_thp_gfp_mask(vma);
>>  	while (orders) {
>>  		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
>> -		folio = vma_alloc_folio(gfp, order, vma, addr, true);
>> +		folio = vma_alloc_folio(gfp, order, vma, addr, false);
>>  		if (folio) {
>>  			if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
>>  				count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>
On 2024/9/13 18:36, Kefeng Wang wrote:
> Hi All,
>
> On 2024/9/10 22:18, Kefeng Wang wrote:
>>
>>
>> On 2024/9/10 22:06, Kefeng Wang wrote:
>>> When the hugepage parameter is true in vma_alloc_folio(), it indicates
>>> that we only try allocation on preferred node if possible for PMD_ORDER,
>>
>> Should remove "for PMD_ORDER", I mean that it was used for PMD_ORDER,
>> but for other high-order, it will reduce the success rate of
>> allocation if without ddc1a5cbc05d.
>>
>>
>>> but it could lead to lots of failures for large folio allocation,
>>> luckily the hugepage parameter was deprecated since commit ddc1a5cbc05d
>>> ("mempolicy: alloc_pages_mpol() for NUMA policy without vma"), so no
>>> effect on runtime behavior.
>>>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>>
>>> Found the issue when backport mthp to inner kernel without ddc1a5cbc05d,
>>> but for mainline, there is no issue, no clue why hugepage parameter was
>>> retained, maybe just kill the parameter for mainline?
>
>
> Any comments, fix in alloc_anon_folio() or remove hugepage parameter in
> vma_alloc_folio(), thanks.

 * vma_alloc_folio - Allocate a folio for a VMA.
 @hugepage: Unused (was: For hugepages try only preferred node if possible).

Since hugepage won't be used in vma_alloc_folio(), maybe just delete this
parameter?

>
>>>
>>>  mm/memory.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index b84443e689a8..89a15858348a 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -4479,7 +4479,7 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
>>>  	gfp = vma_thp_gfp_mask(vma);
>>>  	while (orders) {
>>>  		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
>>> -		folio = vma_alloc_folio(gfp, order, vma, addr, true);
>>> +		folio = vma_alloc_folio(gfp, order, vma, addr, false);
>>>  		if (folio) {
>>>  			if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
>>>  				count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>>
On 09/10/2024 10:15, Kefeng Wang wrote:
>
> On 2024/9/13 18:36, Kefeng Wang wrote:
>> Hi All,
>>
>> On 2024/9/10 22:18, Kefeng Wang wrote:
>>>
>>>
>>> On 2024/9/10 22:06, Kefeng Wang wrote:
>>>> When the hugepage parameter is true in vma_alloc_folio(), it indicates
>>>> that we only try allocation on preferred node if possible for PMD_ORDER,
>>>
>>> Should remove "for PMD_ORDER", I mean that it was used for PMD_ORDER, but for
>>> other high-order, it will reduce the success rate of allocation if without
>>> ddc1a5cbc05d.
>>>
>>>
>>>> but it could lead to lots of failures for large folio allocation,
>>>> luckily the hugepage parameter was deprecated since commit ddc1a5cbc05d
>>>> ("mempolicy: alloc_pages_mpol() for NUMA policy without vma"), so no
>>>> effect on runtime behavior.
>>>>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> ---
>>>>
>>>> Found the issue when backport mthp to inner kernel without ddc1a5cbc05d,
>>>> but for mainline, there is no issue, no clue why hugepage parameter was
>>>> retained, maybe just kill the parameter for mainline?
>>
>>
>> Any comments, fix in alloc_anon_folio() or remove hugepage parameter in
>> vma_alloc_folio(), thanks.
>
>  * vma_alloc_folio - Allocate a folio for a VMA.
>  @hugepage: Unused (was: For hugepages try only preferred node if possible).
>
> Since hugepage won't be used in vma_alloc_folio(), maybe just delete this
> parameter?

Sorry for the radio silence. Given the param is no longer used, I think it would
be cleaner to just remove it.

It was set to true here on purpose though; the aim was to follow the pattern set
by PMD-sized THP, which also sets it to true. And the argument was that the
benefit of having a huge page would be outstripped by having to access it on a
remote node.

Now that the parameter is deprecated, do you know if the policy is still
enforced by other means?

Thanks,
Ryan

>
>>
>>>>
>>>>  mm/memory.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>> index b84443e689a8..89a15858348a 100644
>>>> --- a/mm/memory.c
>>>> +++ b/mm/memory.c
>>>> @@ -4479,7 +4479,7 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
>>>>  	gfp = vma_thp_gfp_mask(vma);
>>>>  	while (orders) {
>>>>  		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
>>>> -		folio = vma_alloc_folio(gfp, order, vma, addr, true);
>>>> +		folio = vma_alloc_folio(gfp, order, vma, addr, false);
>>>>  		if (folio) {
>>>>  			if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
>>>>  				count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>>>
>
On 09.10.24 12:44, Ryan Roberts wrote:
> On 09/10/2024 10:15, Kefeng Wang wrote:
>>
>> On 2024/9/13 18:36, Kefeng Wang wrote:
>>> Hi All,
>>>
>>> On 2024/9/10 22:18, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2024/9/10 22:06, Kefeng Wang wrote:
>>>>> When the hugepage parameter is true in vma_alloc_folio(), it indicates
>>>>> that we only try allocation on preferred node if possible for PMD_ORDER,
>>>>
>>>> Should remove "for PMD_ORDER", I mean that it was used for PMD_ORDER, but for
>>>> other high-order, it will reduce the success rate of allocation if without
>>>> ddc1a5cbc05d.
>>>>
>>>>
>>>>> but it could lead to lots of failures for large folio allocation,
>>>>> luckily the hugepage parameter was deprecated since commit ddc1a5cbc05d
>>>>> ("mempolicy: alloc_pages_mpol() for NUMA policy without vma"), so no
>>>>> effect on runtime behavior.
>>>>>
>>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>> ---
>>>>>
>>>>> Found the issue when backport mthp to inner kernel without ddc1a5cbc05d,
>>>>> but for mainline, there is no issue, no clue why hugepage parameter was
>>>>> retained, maybe just kill the parameter for mainline?
>>>
>>>
>>> Any comments, fix in alloc_anon_folio() or remove hugepage parameter in
>>> vma_alloc_folio(), thanks.
>>
>>  * vma_alloc_folio - Allocate a folio for a VMA.
>>  @hugepage: Unused (was: For hugepages try only preferred node if possible).
>>
>> Since hugepage won't be used in vma_alloc_folio(), maybe just delete this
>> parameter?
>
> Sorry for the radio silence. Given the param is no longer used, I think it would
> be cleaner to just remove it.

Agreed, no dead code.

>
> It was set to true here on purpose though; the aim was to follow the pattern set
> by PMD-sized THP, which also sets it to true. And the argument was that the
> benefit of having a huge page would be outstripped by having to access it on a
> remote node.
>
> Now that the parameter is deprecated, do you know if the policy is still
> enforced by other means?

Right, it might indicate a bug. So figuring out why there are no users left would be interesting. Maybe it was all on purpose.
On 2024/10/9 22:28, David Hildenbrand wrote:
> On 09.10.24 12:44, Ryan Roberts wrote:
>> On 09/10/2024 10:15, Kefeng Wang wrote:
>>>
>>> On 2024/9/13 18:36, Kefeng Wang wrote:
>>>> Hi All,
>>>>
>>>> On 2024/9/10 22:18, Kefeng Wang wrote:
>>>>>
>>>>>
>>>>> On 2024/9/10 22:06, Kefeng Wang wrote:
>>>>>> When the hugepage parameter is true in vma_alloc_folio(), it indicates
>>>>>> that we only try allocation on preferred node if possible for PMD_ORDER,
>>>>>
>>>>> Should remove "for PMD_ORDER", I mean that it was used for PMD_ORDER, but for
>>>>> other high-order, it will reduce the success rate of allocation if without
>>>>> ddc1a5cbc05d.
>>>>>
>>>>>
>>>>>> but it could lead to lots of failures for large folio allocation,
>>>>>> luckily the hugepage parameter was deprecated since commit ddc1a5cbc05d
>>>>>> ("mempolicy: alloc_pages_mpol() for NUMA policy without vma"), so no
>>>>>> effect on runtime behavior.
>>>>>>
>>>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>>> ---
>>>>>>
>>>>>> Found the issue when backport mthp to inner kernel without ddc1a5cbc05d,
>>>>>> but for mainline, there is no issue, no clue why hugepage parameter was
>>>>>> retained, maybe just kill the parameter for mainline?
>>>>
>>>>
>>>> Any comments, fix in alloc_anon_folio() or remove hugepage parameter in
>>>> vma_alloc_folio(), thanks.
>>>
>>>  * vma_alloc_folio - Allocate a folio for a VMA.
>>>  @hugepage: Unused (was: For hugepages try only preferred node if possible).
>>>
>>> Since hugepage won't be used in vma_alloc_folio(), maybe just delete this
>>> parameter?
>>
>> Sorry for the radio silence. Given the param is no longer used, I think it would
>> be cleaner to just remove it.
>
> Agreed, no dead code.

Sure.

>
>>
>> It was set to true here on purpose though; the aim was to follow the pattern set
>> by PMD-sized THP, which also sets it to true. And the argument was that the
>> benefit of having a huge page would be outstripped by having to access it on a
>> remote node.
>>
>> Now that the parameter is deprecated, do you know if the policy is still
>> enforced by other means?
>
> Right, it might indicate a bug. So figuring out why there are no users
> left would be interesting. Maybe it was all on purpose.
>

Before commit ddc1a5cbc05d (v6.7; mTHP arrived in v6.8), the code checked
the hugepage parameter (set only for PMD THP). After that commit, we check
order == PMD_ORDER, so there is no difference for PMD THP, but other high
orders do not follow the pattern set by PMD THP.

	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
	    /* filter "hugepage" allocation, unless from alloc_pages() */
	    order == HPAGE_PMD_ORDER && ilx != NO_INTERLEAVE_INDEX) {

From the changelog of ddc1a5cbc05d:

"The interleave index is almost always irrelevant unless MPOL_INTERLEAVE:
with one exception in alloc_pages_mpol(), where the NO_INTERLEAVE_INDEX
passed down from vma-less alloc_pages() is also used as hint not to use
THP-style hugepage allocation - to avoid the overhead of a hugepage arg
(though I don't understand why we never just added a GFP bit for THP - if
it actually needs a different allocation strategy from other pages of the
same order). vma_alloc_folio() still carries its hugepage arg here, but
it is not used, and should be removed when agreed."

Going by Hugh's changelog, it seems that we could just remove it, but since
mTHP did not exist when this change was made, Hugh did not take other
high-order folio allocations into account.
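The before/after difference described in this message can be modelled in a few lines of C. This is only a sketch, not kernel code: the function names and the `MODEL_` constants are hypothetical, the PMD order of 9 assumes 4 KiB base pages as on x86-64, CONFIG_TRANSPARENT_HUGEPAGE is assumed enabled, and the "before" predicate is reduced to the hugepage flag as the thread describes it.

```c
#include <stdbool.h>

/* Assumed values for illustration only (4 KiB base pages, x86-64 style). */
#define MODEL_HPAGE_PMD_ORDER     9
#define MODEL_NO_INTERLEAVE_INDEX (~0UL)

/*
 * Before ddc1a5cbc05d (as described in the thread): the caller's
 * hugepage flag alone selected the THP-style, preferred-node-first path.
 */
static bool thp_path_before(bool hugepage, int order)
{
	(void)order;	/* the order played no part in the decision */
	return hugepage;
}

/*
 * After ddc1a5cbc05d: the flag is ignored; only a PMD-order request
 * with a real interleave index takes the THP-style path.
 */
static bool thp_path_after(int order, unsigned long ilx)
{
	return order == MODEL_HPAGE_PMD_ORDER &&
	       ilx != MODEL_NO_INTERLEAVE_INDEX;
}
```

Under this model, an order-4 mTHP request passed with hugepage set to true takes the preferred-node path on a kernel without ddc1a5cbc05d, which matches the backport problem reported above, while on mainline the same request does not, whatever the flag says.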
diff --git a/mm/memory.c b/mm/memory.c
index b84443e689a8..89a15858348a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4479,7 +4479,7 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 	gfp = vma_thp_gfp_mask(vma);
 	while (orders) {
 		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
-		folio = vma_alloc_folio(gfp, order, vma, addr, true);
+		folio = vma_alloc_folio(gfp, order, vma, addr, false);
 		if (folio) {
 			if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
 				count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
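For readers unfamiliar with the `addr` computation in the hunk, the following user-space sketch shows what aligning the fault address down to `PAGE_SIZE << order` does for a given order. The helper names and `MODEL_PAGE_SIZE` are hypothetical, and a 4 KiB base page is assumed; the kernel's own ALIGN_DOWN() macro is not reproduced here.

```c
#include <stdint.h>

/* Assumed base page size for illustration (4 KiB). */
#define MODEL_PAGE_SIZE 4096UL

/* Number of bytes covered by a folio of the given order. */
static uint64_t folio_bytes(unsigned int order)
{
	return (uint64_t)MODEL_PAGE_SIZE << order;
}

/*
 * Round the faulting address down to the start of the naturally
 * aligned folio-sized block; valid because the size is a power of two.
 */
static uint64_t folio_align_down(uint64_t address, unsigned int order)
{
	return address & ~(folio_bytes(order) - 1);
}
```

With these assumptions, an order-4 folio spans 64 KiB, so a fault at 0x12345678 maps to a folio starting at 0x12340000.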
When the hugepage parameter is true in vma_alloc_folio(), it indicates
that we only try allocation on preferred node if possible for PMD_ORDER,
but it could lead to lots of failures for large folio allocation,
luckily the hugepage parameter was deprecated since commit ddc1a5cbc05d
("mempolicy: alloc_pages_mpol() for NUMA policy without vma"), so no
effect on runtime behavior.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---

Found the issue when backport mthp to inner kernel without ddc1a5cbc05d,
but for mainline, there is no issue, no clue why hugepage parameter was
retained, maybe just kill the parameter for mainline?

 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)