Message ID | 20200403123059.1846960-1-ying.huang@intel.com (mailing list archive)
State | New, archived
Series | [-V3] /proc/PID/smaps: Add PMD migration entry parsing
On Fri 03-04-20 20:30:59, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
>
> Currently, when reading /proc/PID/smaps, PMD migration entries in the page
> table are simply ignored. To improve the accuracy of /proc/PID/smaps, parsing
> and processing for them is added.
>
> To test the patch, we run pmbench to consume 400 MB of memory in the
> background, then run /usr/bin/migratepages and `cat /proc/PID/smaps` every
> second. The issue below can be reproduced within 60 seconds.
>
> Before the patch, for the fully populated 400 MB anonymous VMA, some THP
> pages under migration may be missed, as below.
>
> 7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0
> Size:             409600 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Rss:              407552 kB
> Pss:              407552 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:    407552 kB
> Referenced:       301056 kB
> Anonymous:        407552 kB
> LazyFree:              0 kB
> AnonHugePages:    405504 kB
> ShmemPmdMapped:        0 kB
> FilePmdMapped:         0 kB
> Shared_Hugetlb:        0 kB
> Private_Hugetlb:       0 kB
> Swap:                  0 kB
> SwapPss:               0 kB
> Locked:                0 kB
> THPeligible:           1
> VmFlags: rd wr mr mw me ac
>
> After the patch, the output is always:
>
> 7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0
> Size:             409600 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Rss:              409600 kB
> Pss:              409600 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:    409600 kB
> Referenced:       294912 kB
> Anonymous:        409600 kB
> LazyFree:              0 kB
> AnonHugePages:    407552 kB
> ShmemPmdMapped:        0 kB
> FilePmdMapped:         0 kB
> Shared_Hugetlb:        0 kB
> Private_Hugetlb:       0 kB
> Swap:                  0 kB
> SwapPss:               0 kB
> Locked:                0 kB
> THPeligible:           1
> VmFlags: rd wr mr mw me ac
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: Yang Shi <yang.shi@linux.alibaba.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>
> v3:
>
> - Revised the patch description and removed VM_WARN_ON_ONCE() per Michal's
>   comments.
>
> v2:
>
> - Use thp_migration_supported() in the condition to reduce code size if THP
>   migration isn't enabled.
>
> - Replace VM_BUG_ON() with VM_WARN_ON_ONCE(); it's not necessary to nuke the
>   kernel for this.
> ---
>  fs/proc/task_mmu.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 8d382d4ec067..36dc7417c0df 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -546,10 +546,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>  	struct mem_size_stats *mss = walk->private;
>  	struct vm_area_struct *vma = walk->vma;
>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
> -	struct page *page;
> +	struct page *page = NULL;
> +
> +	if (pmd_present(*pmd)) {
> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> +	} else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) {
> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
>
> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> +		if (is_migration_entry(entry))
> +			page = migration_entry_to_page(entry);
> +	}
>  	if (IS_ERR_OR_NULL(page))
>  		return;
>  	if (PageAnon(page))
> @@ -578,8 +585,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>
>  	ptl = pmd_trans_huge_lock(pmd, vma);
>  	if (ptl) {
> -		if (pmd_present(*pmd))
> -			smaps_pmd_entry(pmd, addr, walk);
> +		smaps_pmd_entry(pmd, addr, walk);
>  		spin_unlock(ptl);
>  		goto out;
>  	}
> --
> 2.25.0
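For reference, the test described in the commit message can be approximated without pmbench in a single file. The sketch below is hypothetical (it was not posted in this thread): it forks a child that fully populates a 400 MB THP-backed anonymous VMA, then bounces the child's pages between NUMA nodes with the migrate_pages(2) syscall, the same mechanism /usr/bin/migratepages uses, while dumping Rss from the child's smaps. It assumes a machine with at least two NUMA nodes, THP enabled, and root privileges (CAP_SYS_NICE); the 400 MB size and the 60-second window come from the commit message, everything else is illustrative. Build with `gcc -O2 repro.c -o repro -lnuma`.

/*
 * repro.c - hypothetical reproducer sketch, not from this thread.
 * Approximates "pmbench + /usr/bin/migratepages + cat /proc/PID/smaps".
 */
#include <numaif.h>		/* migrate_pages(), from libnuma */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define SIZE	(400UL << 20)	/* 400 MB, as in the commit message */

int main(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		/* Child: fully populate a 400 MB anonymous VMA with THP. */
		char *buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		madvise(buf, SIZE, MADV_HUGEPAGE);
		memset(buf, 1, SIZE);
		pause();	/* keep the mapping alive */
		_exit(0);
	}

	sleep(2);		/* crude: let the child finish faulting in */

	for (int i = 0; i < 60; i++) {
		/* Bounce the child's pages between nodes 0 and 1 ... */
		unsigned long from = 1UL << (i & 1);
		unsigned long to = 1UL << (1 - (i & 1));
		char cmd[64];

		migrate_pages(pid, 64, &from, &to);

		/* ... and read smaps while migration entries may be live. */
		snprintf(cmd, sizeof(cmd), "grep '^Rss:' /proc/%d/smaps", pid);
		system(cmd);
		sleep(1);
	}

	kill(pid, SIGKILL);
	return 0;
}

On an unpatched kernel, the Rss line for the 400 MB VMA should intermittently read below 409600 kB while THP migration entries are in place; with the patch applied it stays at 409600 kB, matching the before/after output quoted above.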
On Fri, Apr 03, 2020 at 08:30:59PM +0800, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
>
> Currently, when reading /proc/PID/smaps, PMD migration entries in the page
> table are simply ignored. To improve the accuracy of /proc/PID/smaps, parsing
> and processing for them is added.
>
> [snip]
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
On 4/3/20 2:30 PM, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
>
> Currently, when reading /proc/PID/smaps, PMD migration entries in the page
> table are simply ignored. To improve the accuracy of /proc/PID/smaps, parsing
> and processing for them is added.
>
> [snip]
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thanks!
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 8d382d4ec067..36dc7417c0df 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -546,10 +546,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 	struct mem_size_stats *mss = walk->private;
 	struct vm_area_struct *vma = walk->vma;
 	bool locked = !!(vma->vm_flags & VM_LOCKED);
-	struct page *page;
+	struct page *page = NULL;
+
+	if (pmd_present(*pmd)) {
+		/* FOLL_DUMP will return -EFAULT on huge zero page */
+		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
+	} else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) {
+		swp_entry_t entry = pmd_to_swp_entry(*pmd);

-	/* FOLL_DUMP will return -EFAULT on huge zero page */
-	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
+		if (is_migration_entry(entry))
+			page = migration_entry_to_page(entry);
+	}
 	if (IS_ERR_OR_NULL(page))
 		return;
 	if (PageAnon(page))
@@ -578,8 +585,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,

 	ptl = pmd_trans_huge_lock(pmd, vma);
 	if (ptl) {
-		if (pmd_present(*pmd))
-			smaps_pmd_entry(pmd, addr, walk);
+		smaps_pmd_entry(pmd, addr, walk);
 		spin_unlock(ptl);
 		goto out;
 	}
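The reason a non-present PMD can still be accounted here: while a THP is under migration, its PMD is replaced with a swap-style entry whose type marks it as a migration entry and whose offset carries the PFN of the page being migrated. The helpers the patch relies on look roughly like the following simplified paraphrase of include/linux/swapops.h from kernels of this era; per-architecture entry encodings differ, so treat it as illustrative rather than exact.

/* A migration entry is a swap-style entry with a special "type". */
static inline int is_migration_entry(swp_entry_t entry)
{
	return unlikely(swp_type(entry) == SWP_MIGRATION_READ ||
			swp_type(entry) == SWP_MIGRATION_WRITE);
}

/*
 * The "offset" of a migration entry is the PFN of the page being
 * migrated, so the page is still reachable while the PMD is non-present.
 */
static inline struct page *migration_entry_to_page(swp_entry_t entry)
{
	struct page *p = pfn_to_page(swp_offset(entry));

	/* Migration entries may only be used while the page is locked. */
	BUG_ON(!PageLocked(compound_head(p)));
	return p;
}

This is also why smaps_pmd_entry() can safely use the returned page under the PMD lock: migration entries are only installed and consumed while the page in question is locked, so the page cannot be freed out from under the smaps walk.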