Message ID | 1681091102-31907-1-git-send-email-Xiaosong.Ma@unisoc.com (mailing list archive) |
---|---|
State | New |
Series | [V2] fs: perform the check when page without mapping but page->mapping contains junk or random bitscribble |
On Mon, Apr 10, 2023 at 09:45:02AM +0800, xiaosong.ma wrote:
> perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
> For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
> is non-NULL mapping and page->mapping is 0x80000000000.
>
> crash_arm64> bt
> PID: 232 TASK: ffffff80e8c2c340 CPU: 0 COMMAND: "Binder:232_2"
> #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
> #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
> #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
> #3 [ffffffc013e5b370] die at ffffffc010267670
> #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
> #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
> #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
> #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
> #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
> #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
> #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc

This doesn't show a crash in dump_mapping(), it shows a crash in
__dump_page().

> diff --git a/fs/inode.c b/fs/inode.c
> index f453eb5..c9021e5 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -564,7 +564,8 @@ void dump_mapping(const struct address_space *mapping)
>  	 * If mapping is an invalid pointer, we don't want to crash
>  	 * accessing it, so probe everything depending on it carefully.
>  	 */
> -	if (get_kernel_nofault(host, &mapping->host) ||
> +	if (get_kernel_nofault(mapping, &mapping) ||
> +	    get_kernel_nofault(host, &mapping->host) ||

This patch makes no sense.  Essentially, you're saying

	mapping = &mapping

which is obviously wrong.
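To make the objection to the added check concrete, here is a minimal userspace sketch in C. It assumes a hypothetical macro, mock_get_nofault(), that models only the copy performed by get_kernel_nofault(val, ptr) (read *ptr into val, return 0 on success) and none of the kernel's fault handling; probe_self() is likewise an illustrative name, not a kernel function. With the arguments used in the patch, the source is the address of the local variable itself, which is always readable, so the probe should succeed and leave the junk value in place.

```c
#include <stdint.h>

struct address_space;	/* opaque in this sketch; never dereferenced */

/*
 * Hypothetical userspace stand-in for get_kernel_nofault(val, ptr):
 * copy *ptr into val and report success (0). The real kernel macro
 * also returns an error instead of oopsing when ptr faults; that part
 * is deliberately not modelled here.
 */
#define mock_get_nofault(val, ptr)	((val) = *(ptr), 0)

/*
 * Mirrors the check added by the patch. The source address is
 * &mapping, the caller's own local variable, so the read cannot fail
 * and mapping keeps whatever value it had, junk or not.
 */
int probe_self(const struct address_space *mapping, uintptr_t *value_after)
{
	int ret = mock_get_nofault(mapping, &mapping);

	*value_after = (uintptr_t)mapping;
	return ret;
}
```

Calling probe_self() with the junk value from the report (0x80000000000) returns 0 and stores the same value back through value_after, which is the sense in which the new check cannot reject a wild pointer.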
On Tue, 11 Apr 2023 13:16:18 +0100 Matthew Wilcox <willy@infradead.org> wrote:

> On Mon, Apr 10, 2023 at 09:45:02AM +0800, xiaosong.ma wrote:
> > perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
> > For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
> > is non-NULL mapping and page->mapping is 0x80000000000.
> >
> > crash_arm64> bt
> > PID: 232 TASK: ffffff80e8c2c340 CPU: 0 COMMAND: "Binder:232_2"
> > #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
> > #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
> > #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
> > #3 [ffffffc013e5b370] die at ffffffc010267670
> > #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
> > #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
> > #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
> > #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
> > #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
> > #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
> > #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc
>
> This doesn't show a crash in dump_mapping(), it shows a crash in
> __dump_page().

um, yes.

But if page->mapping is corrupted, where does __dump_page() dereference it?

The initial patch
(https://lkml.kernel.org/r/1680587425-4683-1-git-send-email-Xiaosong.Ma@unisoc.com)
prevented __dump_page() from calling dump_mapping() if page->mapping is
bad, and that presumably fixed things.

> > diff --git a/fs/inode.c b/fs/inode.c
> > index f453eb5..c9021e5 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -564,7 +564,8 @@ void dump_mapping(const struct address_space *mapping)
> >  	 * If mapping is an invalid pointer, we don't want to crash
> >  	 * accessing it, so probe everything depending on it carefully.
> >  	 */
> > -	if (get_kernel_nofault(host, &mapping->host) ||
> > +	if (get_kernel_nofault(mapping, &mapping) ||
> > +	    get_kernel_nofault(host, &mapping->host) ||
>
> This patch makes no sense.  Essentially, you're saying
> 	mapping = &mapping
> which is obviously wrong.

We're checking for mapping==junk, so this could be

	get_kernel_nofault(tmp, mapping)

or go direct to copy_from_kernel_nofault().  We used to have a
probe_kernel_address() for this...

So confusion reigns.  I think making dump_mapping() tolerant of a wild
mapping pointer makes sense, but I don't think we actually know why the
reporter's kernel crashed.
On Tue, Apr 11, 2023 at 05:15:36PM -0700, Andrew Morton wrote:
> On Tue, 11 Apr 2023 13:16:18 +0100 Matthew Wilcox <willy@infradead.org> wrote:
>
> > On Mon, Apr 10, 2023 at 09:45:02AM +0800, xiaosong.ma wrote:
> > > perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
> > > For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
> > > is non-NULL mapping and page->mapping is 0x80000000000.
> > >
> > > crash_arm64> bt
> > > PID: 232 TASK: ffffff80e8c2c340 CPU: 0 COMMAND: "Binder:232_2"
> > > #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
> > > #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
> > > #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
> > > #3 [ffffffc013e5b370] die at ffffffc010267670
> > > #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
> > > #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
> > > #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
> > > #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
> > > #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
> > > #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
> > > #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc
> >
> > This doesn't show a crash in dump_mapping(), it shows a crash in
> > __dump_page().
>
> um, yes.
>
> But if page->mapping is corrupted, where does __dump_page() dereference it?

I don't see anywhere that it does, so I'm suspicious that we have the
correct diagnosis here.

> The initial patch
> (https://lkml.kernel.org/r/1680587425-4683-1-git-send-email-Xiaosong.Ma@unisoc.com)
> prevented __dump_page() from calling dump_mapping() if page->mapping is
> bad, and that presumably fixed things.

Right, but doesn't the _existing_ get_kernel_nofault(host, &mapping->host)
already prevent us from blindly dereferencing a bad mapping pointer?

> > > -	if (get_kernel_nofault(host, &mapping->host) ||
> > > +	if (get_kernel_nofault(mapping, &mapping) ||
> > > +	    get_kernel_nofault(host, &mapping->host) ||
> >
> > This patch makes no sense.  Essentially, you're saying
> > 	mapping = &mapping
> > which is obviously wrong.
>
> We're checking for mapping==junk, so this could be
>
> 	get_kernel_nofault(tmp, mapping)

Why will that be better than get_kernel_nofault(host, &mapping->host)?
I see no tangible difference between get_kernel_nofault(0x8000'0000)
and get_kernel_nofault(0x8000'0084) (or whatever the offset is).

> or go direct to copy_from_kernel_nofault().  We used to have a
> probe_kernel_address() for this...
>
> So confusion reigns.  I think making dump_mapping() tolerant of a wild
> mapping pointer makes sense, but I don't think we actually know why the
> reporter's kernel crashed.

In my mind dump_mapping() is already tolerant of a wild page->mapping
pointer.  I think the problem is something entirely different.
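The point that probing &mapping->host is no different from probing mapping itself can be sketched in plain C. This is only an illustration under stated assumptions: struct mock_address_space is a hypothetical, simplified layout (the real struct address_space has different members and offsets), and the helper names are invented for this sketch. Forming the address of a member is just base plus a compile-time offset, so a junk base yields an equally junk address for the nofault probe to reject, without the base ever being dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layout; the real struct address_space differs. */
struct mock_address_space {
	void *host;
	unsigned long flags;
	const void *a_ops;
};

/*
 * Address that &mapping->host would name. Done with integer arithmetic
 * so that the junk base is only used for pointer math.
 */
uintptr_t host_probe_address(uintptr_t mapping)
{
	return mapping + offsetof(struct mock_address_space, host);
}

/* Address that &mapping->a_ops would name, computed the same way. */
uintptr_t a_ops_probe_address(uintptr_t mapping)
{
	return mapping + offsetof(struct mock_address_space, a_ops);
}
```

For a base of 0x80000000, both helpers return an address a fixed, small distance from that base, so whichever of them is handed to a nofault read should fault (and be caught) exactly when a read of the base itself would.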
Just looking at this and the backtrace:

> On Apr 12, 2023, at 09:14, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Apr 11, 2023 at 05:15:36PM -0700, Andrew Morton wrote:
>> On Tue, 11 Apr 2023 13:16:18 +0100 Matthew Wilcox <willy@infradead.org> wrote:
>>
>>> On Mon, Apr 10, 2023 at 09:45:02AM +0800, xiaosong.ma wrote:
>>>> perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
>>>> For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
>>>> is non-NULL mapping and page->mapping is 0x80000000000.
>>>>
>>>> crash_arm64> bt
>>>> PID: 232 TASK: ffffff80e8c2c340 CPU: 0 COMMAND: "Binder:232_2"
>>>> #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
>>>> #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
>>>> #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
>>>> #3 [ffffffc013e5b370] die at ffffffc010267670
>>>> #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
>>>> #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
>>>> #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
>>>> #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
>>>> #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
>>>> #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
>>>> #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc
>>>
>>> This doesn't show a crash in dump_mapping(), it shows a crash in
>>> __dump_page().
>>
>> um, yes.
>>
>> But if page->mapping is corrupted, where does __dump_page() dereference it?
>
> I don't see anywhere that it does, so I'm suspicious that we have the
> correct diagnosis here.

I agree; since dump_mapping() is an actual function rather than a macro or
inline, if a bad dereference were happening within dump_mapping() I would
think we SHOULD see the call to dump_mapping() on the stack unless I'm
missing something obvious here.

Instead I'd like to know which instruction the faulting address in
__dump_page() maps to for the kernel experiencing this.

>> The initial patch
>> (https://lkml.kernel.org/r/1680587425-4683-1-git-send-email-Xiaosong.Ma@unisoc.com)
>> prevented __dump_page() from calling dump_mapping() if page->mapping is
>> bad, and that presumably fixed things.
>
> Right, but doesn't the _existing_ get_kernel_nofault(host, &mapping->host)
> already prevent us from blindly dereferencing a bad mapping pointer?

I would think it would, but given the traceback, is the fault occurring
within dump_mapping(), or have we perhaps completed dump_mapping() and some
subtle corruption occurred such that the fault occurs on the return to
__dump_page()?

Certainly dump_mapping() looks to do the right thing to avoid using a bad
passed "mapping" as it's not dereferenced anywhere without checks, just used
for pointer math to create an address for calls to get_kernel_nofault().

>> So confusion reigns.  I think making dump_mapping() tolerant of a wild
>> mapping pointer makes sense, but I don't think we actually know why the
>> reporter's kernel crashed.
>
> In my mind dump_mapping() is already tolerant of a wild page->mapping
> pointer.  I think the problem is something entirely different.

Again, I agree.  As posited above, could it be that something occurs within
dump_mapping() such that when the code returns to __dump_page() it is at
THAT point that the fault occurs?  That would explain the backtrace and why
it shows the fault as occurring within __dump_page(), but upon first glance
the mechanism by which this could be occurring eludes me.

The original patch doesn't mention whether any pr_warn() messages were
printed as a result of the call to dump_mapping(), and the suggested fix
would fix the issue whether the fault were occurring within dump_mapping()
or in the return from calling dump_mapping().

--
Bill
diff --git a/fs/inode.c b/fs/inode.c
index f453eb5..c9021e5 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -564,7 +564,8 @@ void dump_mapping(const struct address_space *mapping)
 	 * If mapping is an invalid pointer, we don't want to crash
 	 * accessing it, so probe everything depending on it carefully.
 	 */
-	if (get_kernel_nofault(host, &mapping->host) ||
+	if (get_kernel_nofault(mapping, &mapping) ||
+	    get_kernel_nofault(host, &mapping->host) ||
 	    get_kernel_nofault(a_ops, &mapping->a_ops)) {
 		pr_warn("invalid mapping:%px\n", mapping);
 		return;
perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
is non-NULL mapping and page->mapping is 0x80000000000.

crash_arm64> bt
PID: 232 TASK: ffffff80e8c2c340 CPU: 0 COMMAND: "Binder:232_2"
#0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
#1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
#2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
#3 [ffffffc013e5b370] die at ffffffc010267670
#4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
#5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
#6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
#7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
#8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
#9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
#10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc
#11 [ffffffc013e5b630] bad_page at ffffffc0104e6ffc
#12 [ffffffc013e5b820] rmqueue_bulk at ffffffc0104e9128
#13 [ffffffc013e5b950] rmqueue at ffffffc0104e7c3c
#14 [ffffffc013e5b9c0] get_page_from_freelist at ffffffc0104e3e3c
#15 [ffffffc013e5ba50] __alloc_pages_nodemask at ffffffc0104e3a7c
#16 [ffffffc013e5bac0] pagecache_get_page at ffffffc01047d0e4
#17 [ffffffc013e5bb20] grab_cache_page_write_begin at ffffffc010480e3c
#18 [ffffffc013e5bb50] block_write_begin at ffffffc010586204
#19 [ffffffc013e5bb90] blkdev_write_begin$75b353f60767e771433fc3b19ba260ab at ffffffc01058cc48
#20 [ffffffc013e5bc00] generic_perform_write at ffffffc010480f1c
#21 [ffffffc013e5bc60] __generic_file_write_iter at ffffffc01048115c
#22 [ffffffc013e5bcf0] blkdev_write_iter at ffffffc01058c0a8
#23 [ffffffc013e5bda0] __vfs_write at ffffffc01052d808
#24 [ffffffc013e5bdd0] vfs_write at ffffffc01052da5c
#25 [ffffffc013e5be30] __arm64_sys_pwrite64 at ffffffc01052e09c
#26 [ffffffc013e5be60] el0_svc_common at ffffffc010272224
#27 [ffffffc013e5bea0] el0_svc_handler at ffffffc010272148
#28 [ffffffc013e5bff0] el0_svc at ffffffc0100a7ec4

Signed-off-by: xiaosong.ma <Xiaosong.Ma@unisoc.com>
---
 fs/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)