Message ID | 20240922151708.33949-1-aha310510@gmail.com (mailing list archive) |
---|---|
State | New |
Series | mm: migrate: fix data-race in migrate_folio_unmap() |
On 22.09.24 17:17, Jeongjun Park wrote:
> I found a report from syzbot [1]
>
> When __folio_test_movable() is called in migrate_folio_unmap() to read
> folio->mapping, a data race occurs because the folio is read without
> protecting it with folio_lock.
>
> This can cause unintended behavior because folio->mapping is initialized
> to a NULL value. Therefore, I think it is appropriate to call
> __folio_test_movable() under the protection of folio_lock to prevent
> data-race.
>

We hold a folio reference, would we really see PAGE_MAPPING_MOVABLE flip?
Hmm

Even a racing __ClearPageMovable() would still leave PAGE_MAPPING_MOVABLE
set.

> [1]
>
> ==================================================================
> BUG: KCSAN: data-race in __filemap_remove_folio / migrate_pages_batch
>
> write to 0xffffea0004b81dd8 of 8 bytes by task 6348 on cpu 0:
> page_cache_delete mm/filemap.c:153 [inline]
> __filemap_remove_folio+0x1ac/0x2c0 mm/filemap.c:233
> filemap_remove_folio+0x6b/0x1f0 mm/filemap.c:265
> truncate_inode_folio+0x42/0x50 mm/truncate.c:178
> shmem_undo_range+0x25b/0xa70 mm/shmem.c:1028
> shmem_truncate_range mm/shmem.c:1144 [inline]
> shmem_evict_inode+0x14d/0x530 mm/shmem.c:1272
> evict+0x2f0/0x580 fs/inode.c:731
> iput_final fs/inode.c:1883 [inline]
> iput+0x42a/0x5b0 fs/inode.c:1909
> dentry_unlink_inode+0x24f/0x260 fs/dcache.c:412
> __dentry_kill+0x18b/0x4c0 fs/dcache.c:615
> dput+0x5c/0xd0 fs/dcache.c:857
> __fput+0x3fb/0x6d0 fs/file_table.c:439
> ____fput+0x1c/0x30 fs/file_table.c:459
> task_work_run+0x13a/0x1a0 kernel/task_work.c:228
> resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
> exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
> exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
> syscall_exit_to_user_mode+0xbe/0x130 kernel/entry/common.c:218
> do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> read to 0xffffea0004b81dd8 of 8 bytes by task 6342 on cpu 1:
> __folio_test_movable include/linux/page-flags.h:699 [inline]
> migrate_folio_unmap mm/migrate.c:1199 [inline]
> migrate_pages_batch+0x24c/0x1940 mm/migrate.c:1797
> migrate_pages_sync mm/migrate.c:1963 [inline]
> migrate_pages+0xff1/0x1820 mm/migrate.c:2072
> do_mbind mm/mempolicy.c:1390 [inline]
> kernel_mbind mm/mempolicy.c:1533 [inline]
> __do_sys_mbind mm/mempolicy.c:1607 [inline]
> __se_sys_mbind+0xf76/0x1160 mm/mempolicy.c:1603
> __x64_sys_mbind+0x78/0x90 mm/mempolicy.c:1603
> x64_sys_call+0x2b4d/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:238
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> value changed: 0xffff888127601078 -> 0x0000000000000000

Note that this doesn't flip PAGE_MAPPING_MOVABLE, just some unrelated bits.

>
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Cc: stable@vger.kernel.org
> Fixes: 7e2a5e5ab217 ("mm: migrate: use __folio_test_movable()")
> Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> ---
> mm/migrate.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 923ea80ba744..e62dac12406b 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1118,7 +1118,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>  	int rc = -EAGAIN;
>  	int old_page_state = 0;
>  	struct anon_vma *anon_vma = NULL;
> -	bool is_lru = !__folio_test_movable(src);
> +	bool is_lru;
>  	bool locked = false;
>  	bool dst_locked = false;
>
> @@ -1172,6 +1172,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>  	locked = true;
>  	if (folio_test_mlocked(src))
>  		old_page_state |= PAGE_WAS_MLOCKED;
> +	is_lru = !__folio_test_movable(src);

Looks straight forward, though

Acked-by: David Hildenbrand <david@redhat.com>
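The bit-level argument in this reply can be checked with a small userspace sketch. It is only a model: the constants are assumed to match the low-bit tags the kernel keeps in folio->mapping (include/linux/page-flags.h), and mapping_test_movable() and model_clear_movable() are hypothetical stand-ins for __folio_test_movable() and the store done by __ClearPageMovable(), not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed values of the tag bits stored in the low bits of folio->mapping.
 * Verify against include/linux/page-flags.h in your tree.
 */
#define PAGE_MAPPING_ANON    0x1ULL
#define PAGE_MAPPING_MOVABLE 0x2ULL
#define PAGE_MAPPING_FLAGS   (PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)

/* The two values from the KCSAN report ("value changed: ... -> ..."). */
#define KCSAN_BEFORE 0xffff888127601078ULL
#define KCSAN_AFTER  0x0000000000000000ULL

/* Hypothetical stand-in for __folio_test_movable(), on the raw word. */
static bool mapping_test_movable(uint64_t mapping)
{
	return (mapping & PAGE_MAPPING_FLAGS) == PAGE_MAPPING_MOVABLE;
}

/*
 * Hypothetical model of a racing __ClearPageMovable(): the pointer part
 * is dropped but the MOVABLE tag bit is kept.
 */
static uint64_t model_clear_movable(uint64_t mapping)
{
	(void)mapping;
	return PAGE_MAPPING_MOVABLE;
}

/* Mirrors "bool is_lru = !__folio_test_movable(src);" from the patch. */
static bool model_is_lru(uint64_t mapping)
{
	return !mapping_test_movable(mapping);
}
```

Under these assumptions, model_is_lru() returns the same answer for KCSAN_BEFORE and KCSAN_AFTER, because truncation only clears pointer bits whose low two bits were already zero, and a mapping that starts out tagged movable still tests movable after model_clear_movable().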
On Mon, Sep 23, 2024 at 05:56:40PM +0200, David Hildenbrand wrote:
> On 22.09.24 17:17, Jeongjun Park wrote:
> > I found a report from syzbot [1]
> >
> > When __folio_test_movable() is called in migrate_folio_unmap() to read
> > folio->mapping, a data race occurs because the folio is read without
> > protecting it with folio_lock.
> >
> > This can cause unintended behavior because folio->mapping is initialized
> > to a NULL value. Therefore, I think it is appropriate to call
> > __folio_test_movable() under the protection of folio_lock to prevent
> > data-race.
> >
>
> We hold a folio reference, would we really see PAGE_MAPPING_MOVABLE flip?
> Hmm

No; this shows a page cache folio getting truncated. It's fine; really
a false alarm from the tool. I don't think the proposed patch
introduces any problems, but it's all a bit meh.

> Even a racing __ClearPageMovable() would still leave PAGE_MAPPING_MOVABLE
> set.
>
> > [1]
> >
> > ==================================================================
> > BUG: KCSAN: data-race in __filemap_remove_folio / migrate_pages_batch
> >
> > write to 0xffffea0004b81dd8 of 8 bytes by task 6348 on cpu 0:
> > page_cache_delete mm/filemap.c:153 [inline]
> > __filemap_remove_folio+0x1ac/0x2c0 mm/filemap.c:233
> > filemap_remove_folio+0x6b/0x1f0 mm/filemap.c:265
> > truncate_inode_folio+0x42/0x50 mm/truncate.c:178
> > shmem_undo_range+0x25b/0xa70 mm/shmem.c:1028
> > shmem_truncate_range mm/shmem.c:1144 [inline]
> > shmem_evict_inode+0x14d/0x530 mm/shmem.c:1272
> > evict+0x2f0/0x580 fs/inode.c:731
> > iput_final fs/inode.c:1883 [inline]
> > iput+0x42a/0x5b0 fs/inode.c:1909
> > dentry_unlink_inode+0x24f/0x260 fs/dcache.c:412
> > __dentry_kill+0x18b/0x4c0 fs/dcache.c:615
> > dput+0x5c/0xd0 fs/dcache.c:857
> > __fput+0x3fb/0x6d0 fs/file_table.c:439
> > ____fput+0x1c/0x30 fs/file_table.c:459
> > task_work_run+0x13a/0x1a0 kernel/task_work.c:228
> > resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
> > exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
> > exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
> > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
> > syscall_exit_to_user_mode+0xbe/0x130 kernel/entry/common.c:218
> > do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >
> > read to 0xffffea0004b81dd8 of 8 bytes by task 6342 on cpu 1:
> > __folio_test_movable include/linux/page-flags.h:699 [inline]
> > migrate_folio_unmap mm/migrate.c:1199 [inline]
> > migrate_pages_batch+0x24c/0x1940 mm/migrate.c:1797
> > migrate_pages_sync mm/migrate.c:1963 [inline]
> > migrate_pages+0xff1/0x1820 mm/migrate.c:2072
> > do_mbind mm/mempolicy.c:1390 [inline]
> > kernel_mbind mm/mempolicy.c:1533 [inline]
> > __do_sys_mbind mm/mempolicy.c:1607 [inline]
> > __se_sys_mbind+0xf76/0x1160 mm/mempolicy.c:1603
> > __x64_sys_mbind+0x78/0x90 mm/mempolicy.c:1603
> > x64_sys_call+0x2b4d/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:238
> > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >
> > value changed: 0xffff888127601078 -> 0x0000000000000000
>
> Note that this doesn't flip PAGE_MAPPING_MOVABLE, just some unrelated bits.
>
> >
> > Reported-by: syzbot <syzkaller@googlegroups.com>
> > Cc: stable@vger.kernel.org
> > Fixes: 7e2a5e5ab217 ("mm: migrate: use __folio_test_movable()")
> > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > ---
> > mm/migrate.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 923ea80ba744..e62dac12406b 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1118,7 +1118,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
> >  	int rc = -EAGAIN;
> >  	int old_page_state = 0;
> >  	struct anon_vma *anon_vma = NULL;
> > -	bool is_lru = !__folio_test_movable(src);
> > +	bool is_lru;
> >  	bool locked = false;
> >  	bool dst_locked = false;
> > @@ -1172,6 +1172,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
> >  	locked = true;
> >  	if (folio_test_mlocked(src))
> >  		old_page_state |= PAGE_WAS_MLOCKED;
> > +	is_lru = !__folio_test_movable(src);
>
>
> Looks straight forward, though
>
> Acked-by: David Hildenbrand <david@redhat.com>
>
>
> --
> Cheers,
>
> David / dhildenb
>
> Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Sep 23, 2024 at 05:56:40PM +0200, David Hildenbrand wrote:
>>> On 22.09.24 17:17, Jeongjun Park wrote:
>>> I found a report from syzbot [1]
>>>
>>> When __folio_test_movable() is called in migrate_folio_unmap() to read
>>> folio->mapping, a data race occurs because the folio is read without
>>> protecting it with folio_lock.
>>>
>>> This can cause unintended behavior because folio->mapping is initialized
>>> to a NULL value. Therefore, I think it is appropriate to call
>>> __folio_test_movable() under the protection of folio_lock to prevent
>>> data-race.
>>
>> We hold a folio reference, would we really see PAGE_MAPPING_MOVABLE flip?
>> Hmm
>
> No; this shows a page cache folio getting truncated. It's fine; really
> a false alarm from the tool. I don't think the proposed patch
> introduces any problems, but it's all a bit meh.
>

Well, I still don't understand why it's okay to read folio->mapping
without folio_lock. Since migrate_folio_unmap() is already protected by
folio_lock, I think it's definitely necessary to fix it to read
folio->mapping under folio_lock protection. If it were still okay to call
__folio_test_movable() without folio_lock, then we could annotate
data-race, but I'm still not sure if this is a good way to do it.

Regards,
Jeongjun Park

>> Even a racing __ClearPageMovable() would still leave PAGE_MAPPING_MOVABLE
>> set.
>>
>>> [1]
>>>
>>> ==================================================================
>>> BUG: KCSAN: data-race in __filemap_remove_folio / migrate_pages_batch
>>>
>>> write to 0xffffea0004b81dd8 of 8 bytes by task 6348 on cpu 0:
>>> page_cache_delete mm/filemap.c:153 [inline]
>>> __filemap_remove_folio+0x1ac/0x2c0 mm/filemap.c:233
>>> filemap_remove_folio+0x6b/0x1f0 mm/filemap.c:265
>>> truncate_inode_folio+0x42/0x50 mm/truncate.c:178
>>> shmem_undo_range+0x25b/0xa70 mm/shmem.c:1028
>>> shmem_truncate_range mm/shmem.c:1144 [inline]
>>> shmem_evict_inode+0x14d/0x530 mm/shmem.c:1272
>>> evict+0x2f0/0x580 fs/inode.c:731
>>> iput_final fs/inode.c:1883 [inline]
>>> iput+0x42a/0x5b0 fs/inode.c:1909
>>> dentry_unlink_inode+0x24f/0x260 fs/dcache.c:412
>>> __dentry_kill+0x18b/0x4c0 fs/dcache.c:615
>>> dput+0x5c/0xd0 fs/dcache.c:857
>>> __fput+0x3fb/0x6d0 fs/file_table.c:439
>>> ____fput+0x1c/0x30 fs/file_table.c:459
>>> task_work_run+0x13a/0x1a0 kernel/task_work.c:228
>>> resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>>> exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
>>> exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
>>> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>>> syscall_exit_to_user_mode+0xbe/0x130 kernel/entry/common.c:218
>>> do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89
>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>
>>> read to 0xffffea0004b81dd8 of 8 bytes by task 6342 on cpu 1:
>>> __folio_test_movable include/linux/page-flags.h:699 [inline]
>>> migrate_folio_unmap mm/migrate.c:1199 [inline]
>>> migrate_pages_batch+0x24c/0x1940 mm/migrate.c:1797
>>> migrate_pages_sync mm/migrate.c:1963 [inline]
>>> migrate_pages+0xff1/0x1820 mm/migrate.c:2072
>>> do_mbind mm/mempolicy.c:1390 [inline]
>>> kernel_mbind mm/mempolicy.c:1533 [inline]
>>> __do_sys_mbind mm/mempolicy.c:1607 [inline]
>>> __se_sys_mbind+0xf76/0x1160 mm/mempolicy.c:1603
>>> __x64_sys_mbind+0x78/0x90 mm/mempolicy.c:1603
>>> x64_sys_call+0x2b4d/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:238
>>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>> do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>
>>> value changed: 0xffff888127601078 -> 0x0000000000000000
>>
>> Note that this doesn't flip PAGE_MAPPING_MOVABLE, just some unrelated bits.
>>
>>>
>>> Reported-by: syzbot <syzkaller@googlegroups.com>
>>> Cc: stable@vger.kernel.org
>>> Fixes: 7e2a5e5ab217 ("mm: migrate: use __folio_test_movable()")
>>> Signed-off-by: Jeongjun Park <aha310510@gmail.com>
>>> ---
>>> mm/migrate.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 923ea80ba744..e62dac12406b 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1118,7 +1118,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>>>  	int rc = -EAGAIN;
>>>  	int old_page_state = 0;
>>>  	struct anon_vma *anon_vma = NULL;
>>> -	bool is_lru = !__folio_test_movable(src);
>>> +	bool is_lru;
>>>  	bool locked = false;
>>>  	bool dst_locked = false;
>>> @@ -1172,6 +1172,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>>>  	locked = true;
>>>  	if (folio_test_mlocked(src))
>>>  		old_page_state |= PAGE_WAS_MLOCKED;
>>> +	is_lru = !__folio_test_movable(src);
>>
>>
>> Looks straight forward, though
>>
>> Acked-by: David Hildenbrand <david@redhat.com>
>>
>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>
On Tue, Sep 24, 2024 at 09:28:44AM +0900, Jeongjun Park wrote:
> > Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Mon, Sep 23, 2024 at 05:56:40PM +0200, David Hildenbrand wrote:
> >>> On 22.09.24 17:17, Jeongjun Park wrote:
> >>> I found a report from syzbot [1]
> >>>
> >>> When __folio_test_movable() is called in migrate_folio_unmap() to read
> >>> folio->mapping, a data race occurs because the folio is read without
> >>> protecting it with folio_lock.
> >>>
> >>> This can cause unintended behavior because folio->mapping is initialized
> >>> to a NULL value. Therefore, I think it is appropriate to call
> >>> __folio_test_movable() under the protection of folio_lock to prevent
> >>> data-race.
> >>
> >> We hold a folio reference, would we really see PAGE_MAPPING_MOVABLE flip?
> >> Hmm
> >
> > No; this shows a page cache folio getting truncated. It's fine; really
> > a false alarm from the tool. I don't think the proposed patch
> > introduces any problems, but it's all a bit meh.
> >
>
> Well, I still don't understand why it's okay to read folio->mapping
> without folio_lock.

Because it can't be changed in a way which changes the value of
__folio_test_movable(). We have a refcount on the folio at this point,
so it can't be freed. And __folio_set_movable() happens at allocation.
> Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Sep 24, 2024 at 09:28:44AM +0900, Jeongjun Park wrote:
>>> Matthew Wilcox <willy@infradead.org> wrote:
>>>
>>> On Mon, Sep 23, 2024 at 05:56:40PM +0200, David Hildenbrand wrote:
>>>>>> On 22.09.24 17:17, Jeongjun Park wrote:
>>>>>> I found a report from syzbot [1]
>>>>>>
>>>>>> When __folio_test_movable() is called in migrate_folio_unmap() to read
>>>>>> folio->mapping, a data race occurs because the folio is read without
>>>>>> protecting it with folio_lock.
>>>>>>
>>>>>> This can cause unintended behavior because folio->mapping is initialized
>>>>>> to a NULL value. Therefore, I think it is appropriate to call
>>>>>> __folio_test_movable() under the protection of folio_lock to prevent
>>>>>> data-race.
>>>>>
>>>>> We hold a folio reference, would we really see PAGE_MAPPING_MOVABLE flip?
>>>>> Hmm
>>>
>>> No; this shows a page cache folio getting truncated. It's fine; really
>>> a false alarm from the tool. I don't think the proposed patch
>>> introduces any problems, but it's all a bit meh.
>>>
>>
>> Well, I still don't understand why it's okay to read folio->mapping
>> without folio_lock.
>
> Because it can't be changed in a way which changes the value of
> __folio_test_movable(). We have a refcount on the folio at this point,
> so it can't be freed. And __folio_set_movable() happens at allocation.
>

Thanks for the explanation. Then it seems appropriate to annotate
data-race in __folio_test_movable() so that KCSAN ignores it.

I will apply the change and send you a new patch.

Regards,
Jeongjun Park
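For readers unfamiliar with the annotation mentioned here, the following is a rough userspace sketch of what marking the read as an intentional race could look like. It is not the follow-up patch, which is not part of this thread. The kernel's data_race(expr) macro evaluates expr and tells KCSAN not to report races on it; the definition below is a do-nothing stand-in so the example compiles outside the kernel. struct folio_model and folio_model_test_movable() are hypothetical names, and the tag-bit values are assumed to match include/linux/page-flags.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-in only: the real data_race() lives in the kernel's
 * compiler headers and suppresses KCSAN reports for the wrapped access.
 * Here it simply evaluates the expression.
 */
#ifndef data_race
#define data_race(expr) (expr)
#endif

/* Assumed values of the tag bits kept in the low bits of ->mapping. */
#define PAGE_MAPPING_ANON    0x1ULL
#define PAGE_MAPPING_MOVABLE 0x2ULL
#define PAGE_MAPPING_FLAGS   (PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)

/* Hypothetical, heavily reduced model of struct folio. */
struct folio_model {
	uint64_t mapping;	/* pointer bits plus low tag bits */
};

/*
 * Lockless test with the read marked as an intentional race: concurrent
 * writers may change the pointer bits, but the result depends only on
 * the tag bits, which (per the discussion above) are stable while a
 * reference is held.
 */
static bool folio_model_test_movable(const struct folio_model *folio)
{
	uint64_t mapping = data_race(folio->mapping);

	return (mapping & PAGE_MAPPING_FLAGS) == PAGE_MAPPING_MOVABLE;
}
```

The design point is that the annotation documents the race as benign instead of moving the read under folio_lock; whether that is the right trade-off is exactly what the thread is discussing.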
diff --git a/mm/migrate.c b/mm/migrate.c
index 923ea80ba744..e62dac12406b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1118,7 +1118,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
 	int rc = -EAGAIN;
 	int old_page_state = 0;
 	struct anon_vma *anon_vma = NULL;
-	bool is_lru = !__folio_test_movable(src);
+	bool is_lru;
 	bool locked = false;
 	bool dst_locked = false;
 
@@ -1172,6 +1172,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
 	locked = true;
 	if (folio_test_mlocked(src))
 		old_page_state |= PAGE_WAS_MLOCKED;
+	is_lru = !__folio_test_movable(src);
 
 	if (folio_test_writeback(src)) {
 		/*
I found a report from syzbot [1]

When __folio_test_movable() is called in migrate_folio_unmap() to read
folio->mapping, a data race occurs because the folio is read without
protecting it with folio_lock.

This can cause unintended behavior because folio->mapping is initialized
to a NULL value. Therefore, I think it is appropriate to call
__folio_test_movable() under the protection of folio_lock to prevent
data-race.

[1]

==================================================================
BUG: KCSAN: data-race in __filemap_remove_folio / migrate_pages_batch

write to 0xffffea0004b81dd8 of 8 bytes by task 6348 on cpu 0:
 page_cache_delete mm/filemap.c:153 [inline]
 __filemap_remove_folio+0x1ac/0x2c0 mm/filemap.c:233
 filemap_remove_folio+0x6b/0x1f0 mm/filemap.c:265
 truncate_inode_folio+0x42/0x50 mm/truncate.c:178
 shmem_undo_range+0x25b/0xa70 mm/shmem.c:1028
 shmem_truncate_range mm/shmem.c:1144 [inline]
 shmem_evict_inode+0x14d/0x530 mm/shmem.c:1272
 evict+0x2f0/0x580 fs/inode.c:731
 iput_final fs/inode.c:1883 [inline]
 iput+0x42a/0x5b0 fs/inode.c:1909
 dentry_unlink_inode+0x24f/0x260 fs/dcache.c:412
 __dentry_kill+0x18b/0x4c0 fs/dcache.c:615
 dput+0x5c/0xd0 fs/dcache.c:857
 __fput+0x3fb/0x6d0 fs/file_table.c:439
 ____fput+0x1c/0x30 fs/file_table.c:459
 task_work_run+0x13a/0x1a0 kernel/task_work.c:228
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0xbe/0x130 kernel/entry/common.c:218
 do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffffea0004b81dd8 of 8 bytes by task 6342 on cpu 1:
 __folio_test_movable include/linux/page-flags.h:699 [inline]
 migrate_folio_unmap mm/migrate.c:1199 [inline]
 migrate_pages_batch+0x24c/0x1940 mm/migrate.c:1797
 migrate_pages_sync mm/migrate.c:1963 [inline]
 migrate_pages+0xff1/0x1820 mm/migrate.c:2072
 do_mbind mm/mempolicy.c:1390 [inline]
 kernel_mbind mm/mempolicy.c:1533 [inline]
 __do_sys_mbind mm/mempolicy.c:1607 [inline]
 __se_sys_mbind+0xf76/0x1160 mm/mempolicy.c:1603
 __x64_sys_mbind+0x78/0x90 mm/mempolicy.c:1603
 x64_sys_call+0x2b4d/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:238
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0xffff888127601078 -> 0x0000000000000000

Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable@vger.kernel.org
Fixes: 7e2a5e5ab217 ("mm: migrate: use __folio_test_movable()")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
---
 mm/migrate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--