Message ID | 1558711908-15688-1-git-send-email-suzuki.poulose@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm, compaction: Make sure we isolate a valid PFN |
On Fri, May 24, 2019 at 04:31:48PM +0100, Suzuki K Poulose wrote:
> When we have holes in a normal memory zone, we could end up having
> cached_migrate_pfns which may not necessarily be valid, under heavy memory
> pressure with swapping enabled (via __reset_isolation_suitable(), triggered
> by kswapd).
>
> Later, if we fail to find a page via fast_isolate_freepages(), we may
> end up using the migrate_pfn we started the search with as a valid
> page. This could lead to a NULL pointer dereference, due to an invalid
> mem_section pointer.
>
> [...]
>
> Fixes: 5a811889de10f1eb ("mm, compaction: use free lists to quickly locate a migration target")
> Reported-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
On 05/24/2019 09:01 PM, Suzuki K Poulose wrote:
> When we have holes in a normal memory zone, we could end up having
> cached_migrate_pfns which may not necessarily be valid, under heavy memory
> pressure with swapping enabled (via __reset_isolation_suitable(), triggered
> by kswapd).
>
> [...]
>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
diff --git a/mm/compaction.c b/mm/compaction.c
index 9febc8c..9e1b9ac 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1399,7 +1399,7 @@ fast_isolate_freepages(struct compact_control *cc)
 				page = pfn_to_page(highest);
 				cc->free_pfn = highest;
 			} else {
-				if (cc->direct_compaction) {
+				if (cc->direct_compaction && pfn_valid(min_pfn)) {
 					page = pfn_to_page(min_pfn);
 					cc->free_pfn = min_pfn;
 				}
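
For illustration only, here is a small, self-contained user-space sketch of the fallback logic that the hunk above changes. It is a toy model, not kernel code: struct page, pfn_valid() and pfn_to_page() below are simplified stand-ins for the real helpers, and the hole layout is invented. It only shows why min_pfn has to be validated before being turned into a page pointer when it can land in a zone hole.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for the kernel types and helpers (illustration only). */
struct page { unsigned long flags; };

#define NR_PFNS 16UL

static struct page pages[NR_PFNS];

/* NULL entries model PFNs that fall into a hole in the zone. */
static struct page *pfn_map[NR_PFNS] = {
	[0]  = &pages[0],  [1]  = &pages[1],	/* backed range */
	/* PFNs 2..11 are a hole */
	[12] = &pages[12], [13] = &pages[13],
	[14] = &pages[14], [15] = &pages[15],	/* backed range */
};

static bool pfn_valid(unsigned long pfn)
{
	return pfn < NR_PFNS && pfn_map[pfn] != NULL;
}

static struct page *pfn_to_page(unsigned long pfn)
{
	/* Like the kernel helper: only meaningful for a valid PFN. */
	return pfn_map[pfn];
}

/*
 * Mirrors the fixed fallback in fast_isolate_freepages(): only fall back
 * to min_pfn when it actually maps to real memory.
 */
static struct page *fallback_page(unsigned long highest, unsigned long min_pfn,
				  bool direct_compaction)
{
	if (highest)
		return pfn_to_page(highest);
	if (direct_compaction && pfn_valid(min_pfn))
		return pfn_to_page(min_pfn);
	return NULL;
}

int main(void)
{
	/*
	 * min_pfn = 5 lies in the hole; without the pfn_valid() check the
	 * caller would dereference whatever pfn_to_page(5) handed back.
	 */
	struct page *page = fallback_page(0, 5, true);

	printf("fallback page: %p\n", (void *)page);	/* NULL: hole skipped */
	return 0;
}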
When we have holes in a normal memory zone, we could end up having
cached_migrate_pfns which may not necessarily be valid, under heavy memory
pressure with swapping enabled (via __reset_isolation_suitable(), triggered
by kswapd).

Later, if we fail to find a page via fast_isolate_freepages(), we may
end up using the migrate_pfn we started the search with as a valid
page. This could lead to a NULL pointer dereference like the one below,
due to an invalid mem_section pointer.

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 [47/1825]
Mem abort info:
  ESR = 0x96000004
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000004
  CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000082f94ae9
[0000000000000008] pgd=0000000000000000
Internal error: Oops: 96000004 [#1] SMP
...
CPU: 10 PID: 6080 Comm: qemu-system-aar Not tainted 510-rc1+ #6
Hardware name: AmpereComputing(R) OSPREY EV-883832-X3-0001/OSPREY, BIOS 4819 09/25/2018
pstate: 60000005 (nZCv daif -PAN -UAO)
pc : set_pfnblock_flags_mask+0x58/0xe8
lr : compaction_alloc+0x300/0x950
[...]
Process qemu-system-aar (pid: 6080, stack limit = 0x0000000095070da5)
Call trace:
 set_pfnblock_flags_mask+0x58/0xe8
 compaction_alloc+0x300/0x950
 migrate_pages+0x1a4/0xbb0
 compact_zone+0x750/0xde8
 compact_zone_order+0xd8/0x118
 try_to_compact_pages+0xb4/0x290
 __alloc_pages_direct_compact+0x84/0x1e0
 __alloc_pages_nodemask+0x5e0/0xe18
 alloc_pages_vma+0x1cc/0x210
 do_huge_pmd_anonymous_page+0x108/0x7c8
 __handle_mm_fault+0xdd4/0x1190
 handle_mm_fault+0x114/0x1c0
 __get_user_pages+0x198/0x3c0
 get_user_pages_unlocked+0xb4/0x1d8
 __gfn_to_pfn_memslot+0x12c/0x3b8
 gfn_to_pfn_prot+0x4c/0x60
 kvm_handle_guest_abort+0x4b0/0xcd8
 handle_exit+0x140/0x1b8
 kvm_arch_vcpu_ioctl_run+0x260/0x768
 kvm_vcpu_ioctl+0x490/0x898
 do_vfs_ioctl+0xc4/0x898
 ksys_ioctl+0x8c/0xa0
 __arm64_sys_ioctl+0x28/0x38
 el0_svc_common+0x74/0x118
 el0_svc_handler+0x38/0x78
 el0_svc+0x8/0xc
Code: f8607840 f100001f 8b011401 9a801020 (f9400400)
---[ end trace af6a35219325a9b6 ]---

The issue was reported on an arm64 server with 128GB of memory and holes in
the zone (e.g., [32GB@4GB, 96GB@544GB]), with a swap device enabled, while
running 100 KVM guest instances.

This patch fixes the issue by ensuring that the page belongs to a valid PFN
when we fall back to using the lower limit of the scan range upon failure in
fast_isolate_freepages().

Fixes: 5a811889de10f1eb ("mm, compaction: use free lists to quickly locate a migration target")
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 mm/compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
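
As a rough, back-of-the-envelope illustration of how wide the invalid span is with the reported bank layout, the snippet below converts the two banks to PFN ranges assuming 4K pages. The exact zone boundaries on the reported machine are not part of the changelog, so the numbers are only indicative.

#include <stdio.h>

/*
 * Rough PFN arithmetic for the reported layout [32GB@4GB, 96GB@544GB],
 * assuming 4K pages (PFN = physical address >> 12).  Every PFN between the
 * end of bank0 and the start of bank1 falls in a zone hole.
 */
#define GB (1UL << 30)
#define PFN(addr) ((addr) >> 12)

int main(void)
{
	unsigned long bank0_start = 4 * GB,   bank0_end = 4 * GB + 32 * GB;
	unsigned long bank1_start = 544 * GB, bank1_end = 544 * GB + 96 * GB;

	printf("bank0 PFNs: 0x%lx - 0x%lx\n", PFN(bank0_start), PFN(bank0_end));
	printf("bank1 PFNs: 0x%lx - 0x%lx\n", PFN(bank1_start), PFN(bank1_end));
	printf("hole  PFNs: 0x%lx - 0x%lx (not backed by memory)\n",
	       PFN(bank0_end), PFN(bank1_start));
	return 0;
}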