Message ID | 20240712024455.163543-2-zi.yan@sent.com (mailing list archive)
---|---
State | New
Series | Fix and refactor do_{huge_pmd_}numa_page()
Zi Yan <zi.yan@sent.com> writes:

> From: Zi Yan <ziy@nvidia.com>
>
> last_cpupid is only available when memory tiering is off or the folio
> is in toptier node. Complete the check to read last_cpupid when it is
> available.
>
> Before the fix, the default last_cpupid will be used even if memory
> tiering mode is turned off at runtime instead of the actual value. This
> can prevent task_numa_fault() from getting right numa fault stats, but
> should not cause any crash. User might see performance changes after the
> fix.
>
> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
> Signed-off-by: Zi Yan <ziy@nvidia.com>

Good catch! Thanks!

Reviewed-by: "Huang, Ying" <ying.huang@intel.com>

> ---
>  mm/huge_memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index d7c84480f1a4..07d9dde4ca33 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1705,7 +1705,8 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
>  	 * For memory tiering mode, cpupid of slow memory page is used
>  	 * to record page access time. So use default value.
>  	 */
> -	if (node_is_toptier(nid))
> +	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) ||
> +	    node_is_toptier(nid))
>  		last_cpupid = folio_last_cpupid(folio);
>  	target_nid = numa_migrate_prep(folio, vmf, haddr, nid, &flags);
>  	if (target_nid == NUMA_NO_NODE)
On 2024/7/12 10:44, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
>
> last_cpupid is only available when memory tiering is off or the folio
> is in toptier node. Complete the check to read last_cpupid when it is
> available.
>
> Before the fix, the default last_cpupid will be used even if memory
> tiering mode is turned off at runtime instead of the actual value. This
> can prevent task_numa_fault() from getting right numa fault stats, but
> should not cause any crash. User might see performance changes after the
> fix.
>
> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
> Signed-off-by: Zi Yan <ziy@nvidia.com>

LGTM.

Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

> ---
>  mm/huge_memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index d7c84480f1a4..07d9dde4ca33 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1705,7 +1705,8 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
>  	 * For memory tiering mode, cpupid of slow memory page is used
>  	 * to record page access time. So use default value.
>  	 */
> -	if (node_is_toptier(nid))
> +	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) ||
> +	    node_is_toptier(nid))
>  		last_cpupid = folio_last_cpupid(folio);
>  	target_nid = numa_migrate_prep(folio, vmf, haddr, nid, &flags);
>  	if (target_nid == NUMA_NO_NODE)
On 12.07.24 04:44, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
>
> last_cpupid is only available when memory tiering is off or the folio
> is in toptier node. Complete the check to read last_cpupid when it is
> available.
>
> Before the fix, the default last_cpupid will be used even if memory
> tiering mode is turned off at runtime instead of the actual value. This
> can prevent task_numa_fault() from getting right numa fault stats, but
> should not cause any crash. User might see performance changes after the
> fix.
>
> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/huge_memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index d7c84480f1a4..07d9dde4ca33 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1705,7 +1705,8 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
>  	 * For memory tiering mode, cpupid of slow memory page is used
>  	 * to record page access time. So use default value.
>  	 */
> -	if (node_is_toptier(nid))
> +	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) ||
> +	    node_is_toptier(nid))
>  		last_cpupid = folio_last_cpupid(folio);
>  	target_nid = numa_migrate_prep(folio, vmf, haddr, nid, &flags);
>  	if (target_nid == NUMA_NO_NODE)

Reported-by: ...
Closes: ...

If it applies ;)

Acked-by: David Hildenbrand <david@redhat.com>
On 12 Jul 2024, at 21:13, David Hildenbrand wrote:

> On 12.07.24 04:44, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> last_cpupid is only available when memory tiering is off or the folio
>> is in toptier node. Complete the check to read last_cpupid when it is
>> available.
>>
>> Before the fix, the default last_cpupid will be used even if memory
>> tiering mode is turned off at runtime instead of the actual value. This
>> can prevent task_numa_fault() from getting right numa fault stats, but
>> should not cause any crash. User might see performance changes after the
>> fix.
>>
>> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>  mm/huge_memory.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index d7c84480f1a4..07d9dde4ca33 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1705,7 +1705,8 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
>>  	 * For memory tiering mode, cpupid of slow memory page is used
>>  	 * to record page access time. So use default value.
>>  	 */
>> -	if (node_is_toptier(nid))
>> +	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) ||
>> +	    node_is_toptier(nid))
>>  		last_cpupid = folio_last_cpupid(folio);
>>  	target_nid = numa_migrate_prep(folio, vmf, haddr, nid, &flags);
>>  	if (target_nid == NUMA_NO_NODE)
>
> Reported-by: ...

Reported-by: David Hildenbrand <david@redhat.com>

I suppose your email[1] reports the issue based on code inspection.

> Closes: ...

Closes: https://lore.kernel.org/linux-mm/9af34a6b-ca56-4a64-8aa6-ade65f109288@redhat.com/

Will add them in the next version.

>
> If it applies ;)
>
> Acked-by: David Hildenbrand <david@redhat.com>

Thanks.

[1] https://lore.kernel.org/linux-mm/9af34a6b-ca56-4a64-8aa6-ade65f109288@redhat.com/

--
Best Regards,
Yan, Zi
On 13.07.24 03:18, Zi Yan wrote:
> On 12 Jul 2024, at 21:13, David Hildenbrand wrote:
>
>> On 12.07.24 04:44, Zi Yan wrote:
>>> From: Zi Yan <ziy@nvidia.com>
>>>
>>> last_cpupid is only available when memory tiering is off or the folio
>>> is in toptier node. Complete the check to read last_cpupid when it is
>>> available.
>>>
>>> Before the fix, the default last_cpupid will be used even if memory
>>> tiering mode is turned off at runtime instead of the actual value. This
>>> can prevent task_numa_fault() from getting right numa fault stats, but
>>> should not cause any crash. User might see performance changes after the
>>> fix.
>>>
>>> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>> ---
>>>  mm/huge_memory.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index d7c84480f1a4..07d9dde4ca33 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -1705,7 +1705,8 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
>>>  	 * For memory tiering mode, cpupid of slow memory page is used
>>>  	 * to record page access time. So use default value.
>>>  	 */
>>> -	if (node_is_toptier(nid))
>>> +	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) ||
>>> +	    node_is_toptier(nid))
>>>  		last_cpupid = folio_last_cpupid(folio);
>>>  	target_nid = numa_migrate_prep(folio, vmf, haddr, nid, &flags);
>>>  	if (target_nid == NUMA_NO_NODE)
>>
>> Reported-by: ...
>
> Reported-by: David Hildenbrand <david@redhat.com>
>
> I suppose your email[1] reports the issue based on code inspection.

Yes, thanks for taking care of the fix and cleanups!
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d7c84480f1a4..07d9dde4ca33 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1705,7 +1705,8 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	 * For memory tiering mode, cpupid of slow memory page is used
 	 * to record page access time. So use default value.
 	 */
-	if (node_is_toptier(nid))
+	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) ||
+	    node_is_toptier(nid))
 		last_cpupid = folio_last_cpupid(folio);
 	target_nid = numa_migrate_prep(folio, vmf, haddr, nid, &flags);
 	if (target_nid == NUMA_NO_NODE)