[V2] fs: perform the check when page without mapping but page->mapping contains junk or random bitscribble

Message ID 1681091102-31907-1-git-send-email-Xiaosong.Ma@unisoc.com
State New
Series [V2] fs: perform the check when page without mapping but page->mapping contains junk or random bitscribble

Commit Message

xiaosong.ma April 10, 2023, 1:45 a.m. UTC
Perform the check in dump_mapping() to print warning info and avoid a crash when page->mapping is invalid but non-NULL.
For example, the panic with the following backtrace shows that dump_page() prints wrong info and panics when the bad page
has a non-NULL mapping and page->mapping is 0x80000000000.

    crash_arm64> bt
    PID: 232    TASK: ffffff80e8c2c340  CPU: 0   COMMAND: "Binder:232_2"
     #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
     #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
     #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
     #3 [ffffffc013e5b370] die at ffffffc010267670
     #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
     #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
     #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
     #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
     #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
     #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
     #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc
     #11 [ffffffc013e5b630] bad_page at ffffffc0104e6ffc
     #12 [ffffffc013e5b820] rmqueue_bulk at ffffffc0104e9128
     #13 [ffffffc013e5b950] rmqueue at ffffffc0104e7c3c
     #14 [ffffffc013e5b9c0] get_page_from_freelist at ffffffc0104e3e3c
     #15 [ffffffc013e5ba50] __alloc_pages_nodemask at ffffffc0104e3a7c
     #16 [ffffffc013e5bac0] pagecache_get_page at ffffffc01047d0e4
     #17 [ffffffc013e5bb20] grab_cache_page_write_begin at ffffffc010480e3c
     #18 [ffffffc013e5bb50] block_write_begin at ffffffc010586204
     #19 [ffffffc013e5bb90] blkdev_write_begin$75b353f60767e771433fc3b19ba260ab at ffffffc01058cc48
     #20 [ffffffc013e5bc00] generic_perform_write at ffffffc010480f1c
     #21 [ffffffc013e5bc60] __generic_file_write_iter at ffffffc01048115c
     #22 [ffffffc013e5bcf0] blkdev_write_iter at ffffffc01058c0a8
     #23 [ffffffc013e5bda0] __vfs_write at ffffffc01052d808
     #24 [ffffffc013e5bdd0] vfs_write at ffffffc01052da5c
     #25 [ffffffc013e5be30] __arm64_sys_pwrite64 at ffffffc01052e09c
     #26 [ffffffc013e5be60] el0_svc_common at ffffffc010272224
     #27 [ffffffc013e5bea0] el0_svc_handler at ffffffc010272148
     #28 [ffffffc013e5bff0] el0_svc at ffffffc0100a7ec4

Signed-off-by: xiaosong.ma <Xiaosong.Ma@unisoc.com>
---
 fs/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Matthew Wilcox April 11, 2023, 12:16 p.m. UTC | #1
On Mon, Apr 10, 2023 at 09:45:02AM +0800, xiaosong.ma wrote:
> perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
> For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
> is non-NULL mapping and page->mapping is 0x80000000000.
> 
>     crash_arm64> bt
>     PID: 232    TASK: ffffff80e8c2c340  CPU: 0   COMMAND: "Binder:232_2"
>      #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
>      #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
>      #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
>      #3 [ffffffc013e5b370] die at ffffffc010267670
>      #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
>      #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
>      #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
>      #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
>      #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
>      #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
>      #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc

This doesn't show a crash in dump_mapping(), it shows a crash in
__dump_page().

> diff --git a/fs/inode.c b/fs/inode.c
> index f453eb5..c9021e5 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -564,7 +564,8 @@ void dump_mapping(const struct address_space *mapping)
>  	 * If mapping is an invalid pointer, we don't want to crash
>  	 * accessing it, so probe everything depending on it carefully.
>  	 */
> -	if (get_kernel_nofault(host, &mapping->host) ||
> +	if (get_kernel_nofault(mapping, &mapping) ||
> +	    get_kernel_nofault(host, &mapping->host) ||

This patch makes no sense.  Essentially, you're saying
	mapping = &mapping
which is obviously wrong.
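To illustrate the objection above, here is a minimal userspace sketch, not kernel code. It assumes a hypothetical helper read_nofault() that mimics the contract of get_kernel_nofault() (copy a value from an address, return non-zero instead of faulting) by letting write(2) on a pipe do the copy, and a hypothetical struct fake_mapping standing in for struct address_space. A PROT_NONE page plays the role of the wild pointer. The point it tries to show: probing &mapping reads the local variable itself, so it succeeds no matter where mapping points, while probing &mapping->host really reads through the pointer.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for struct address_space (layout is made up). */
struct fake_mapping {
	void *host;
	void *a_ops;
};

/*
 * Hypothetical userspace analogue of a no-fault read: the kernel copies
 * from src on our behalf, so an unreadable src makes write() fail with
 * EFAULT instead of raising SIGSEGV. Returns 0 on success, -1 otherwise.
 */
static int read_nofault(void *dst, const void *src, size_t size)
{
	int fds[2];
	int ret = -1;

	if (pipe(fds) != 0)
		return -1;
	if (write(fds[1], src, size) == (ssize_t)size &&
	    read(fds[0], dst, size) == (ssize_t)size)
		ret = 0;
	close(fds[0]);
	close(fds[1]);
	return ret;
}

/* An address that is certainly unreadable: one PROT_NONE page. */
static struct fake_mapping *make_wild_pointer(void)
{
	void *p = mmap(NULL, 4096, PROT_NONE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Shape of the V2 probe: the source is the local variable, not its target. */
static int probe_patch_style(const struct fake_mapping *mapping)
{
	const struct fake_mapping *tmp;

	return read_nofault(&tmp, &mapping, sizeof(tmp));
}

/* Shape of the existing probe: the source is reached through the pointer. */
static int probe_existing_style(const struct fake_mapping *mapping)
{
	void *host;

	return read_nofault(&host, &mapping->host, sizeof(host));
}

/* Exercise both probes against a valid object and a wild pointer. */
static void demo_probes(void)
{
	struct fake_mapping good = { NULL, NULL };
	struct fake_mapping *wild = make_wild_pointer();

	assert(probe_patch_style(&good) == 0);
	assert(probe_patch_style(wild) == 0);    /* cannot detect the bad pointer */
	assert(probe_existing_style(&good) == 0);
	assert(probe_existing_style(wild) != 0); /* detects it */
}
```

Calling demo_probes() from a test program on Linux exercises all four cases. If this model is a fair one, the added probe in the patch can never fail, so it cannot be what changes the outcome of the reported crash.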

Andrew Morton April 12, 2023, 12:15 a.m. UTC | #2
On Tue, 11 Apr 2023 13:16:18 +0100 Matthew Wilcox <willy@infradead.org> wrote:

> On Mon, Apr 10, 2023 at 09:45:02AM +0800, xiaosong.ma wrote:
> > perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
> > For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
> > is non-NULL mapping and page->mapping is 0x80000000000.
> > 
> >     crash_arm64> bt
> >     PID: 232    TASK: ffffff80e8c2c340  CPU: 0   COMMAND: "Binder:232_2"
> >      #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
> >      #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
> >      #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
> >      #3 [ffffffc013e5b370] die at ffffffc010267670
> >      #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
> >      #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
> >      #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
> >      #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
> >      #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
> >      #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
> >      #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc
> 
> This doesn't show a crash in dump_mapping(), it shows a crash in
> __dump_page().

um, yes.

But if page->mapping is corrupted, where does __dump_page() dereference it?

The initial patch
(https://lkml.kernel.org/r/1680587425-4683-1-git-send-email-Xiaosong.Ma@unisoc.com)
prevented __dump_page() from calling dump_mapping() if page->mapping is
bad, and that presumably fixed things.


> > diff --git a/fs/inode.c b/fs/inode.c
> > index f453eb5..c9021e5 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -564,7 +564,8 @@ void dump_mapping(const struct address_space *mapping)
> >  	 * If mapping is an invalid pointer, we don't want to crash
> >  	 * accessing it, so probe everything depending on it carefully.
> >  	 */
> > -	if (get_kernel_nofault(host, &mapping->host) ||
> > +	if (get_kernel_nofault(mapping, &mapping) ||
> > +	    get_kernel_nofault(host, &mapping->host) ||
> 
> This patch makes no sense.  Essentially, you're saying
> 	mapping = &mapping
> which is obviously wrong.

We're checking for mapping==junk, so this could be 

	get_kernel_nofault(tmp, mapping)

or go direct to copy_from_kernel_nofault().  We used to have a
probe_kernel_address() for this...

So confusion reigns.  I think making dump_mapping() tolerant of a wild
mapping pointer makes sense, but I don't think we actually know why the
reporter's kernel crashed.

Matthew Wilcox April 12, 2023, 3:14 p.m. UTC | #3
On Tue, Apr 11, 2023 at 05:15:36PM -0700, Andrew Morton wrote:
> On Tue, 11 Apr 2023 13:16:18 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> 
> > On Mon, Apr 10, 2023 at 09:45:02AM +0800, xiaosong.ma wrote:
> > > perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
> > > For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
> > > is non-NULL mapping and page->mapping is 0x80000000000.
> > > 
> > >     crash_arm64> bt
> > >     PID: 232    TASK: ffffff80e8c2c340  CPU: 0   COMMAND: "Binder:232_2"
> > >      #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
> > >      #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
> > >      #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
> > >      #3 [ffffffc013e5b370] die at ffffffc010267670
> > >      #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
> > >      #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
> > >      #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
> > >      #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
> > >      #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
> > >      #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
> > >      #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc
> > 
> > This doesn't show a crash in dump_mapping(), it shows a crash in
> > __dump_page().
> 
> um, yes.
> 
> But if page->mapping is corrupted, where does __dump_page() dereference it?

I don't see anywhere that it does, so I'm suspicious that we have the
correct diagnosis here.

> The initial patch
> (https://lkml.kernel.org/r/1680587425-4683-1-git-send-email-Xiaosong.Ma@unisoc.com)
> prevented __dump_page() from calling dump_mapping() if page->mapping is
> bad, and that presumably fixed things.

Right, but doesn't the _existing_ get_kernel_nofault(host, &mapping->host)
already prevent us from blindly dereferencing a bad mapping pointer?

> > > -	if (get_kernel_nofault(host, &mapping->host) ||
> > > +	if (get_kernel_nofault(mapping, &mapping) ||
> > > +	    get_kernel_nofault(host, &mapping->host) ||
> > 
> > This patch makes no sense.  Essentially, you're saying
> > 	mapping = &mapping
> > which is obviously wrong.
> 
> We're checking for mapping==junk, so this could be 
> 
> 	get_kernel_nofault(tmp, mapping)

Why will that be better than get_kernel_nofault(host, &mapping->host)?
I see no tangible difference between get_kernel_nofault(0x8000'0000) and
get_kernel_nofault(0x8000'0084) (or whatever the offset is).

> or go direct to copy_from_kernel_nofault().  We used to have a
> probe_kernel_address() for this...
> 
> So confusion reigns.  I think making dump_mapping() tolerant of a wild
> mapping pointer makes sense, but I don't think we actually know why the
> reporter's kernel crashed.

In my mind dump_mapping() is already tolerant of a wild page->mapping
pointer.  I think the problem is something entirely different.

William Kucharski April 13, 2023, 7:50 a.m. UTC | #4
Just looking at this and the backtrace:

> On Apr 12, 2023, at 09:14, Matthew Wilcox <willy@infradead.org> wrote:
> 
> On Tue, Apr 11, 2023 at 05:15:36PM -0700, Andrew Morton wrote:
>> On Tue, 11 Apr 2023 13:16:18 +0100 Matthew Wilcox <willy@infradead.org> wrote:
>> 
>>> On Mon, Apr 10, 2023 at 09:45:02AM +0800, xiaosong.ma wrote:
>>>> perform the check in dump_mapping() to print warning info and avoid crash with invalid non-NULL page->mapping.
>>>> For example, a panic with following backtraces show dump_page will show wrong info and panic when the bad page
>>>> is non-NULL mapping and page->mapping is 0x80000000000.
>>>> 
>>>>    crash_arm64> bt
>>>>    PID: 232    TASK: ffffff80e8c2c340  CPU: 0   COMMAND: "Binder:232_2"
>>>>     #0 [ffffffc013e5b080] sysdump_panic_event$b2bce43a479f4f7762201bfee02d7889 at ffffffc0108d7c2c
>>>>     #1 [ffffffc013e5b0c0] atomic_notifier_call_chain at ffffffc010300228
>>>>     #2 [ffffffc013e5b2c0] panic at ffffffc0102c926c
>>>>     #3 [ffffffc013e5b370] die at ffffffc010267670
>>>>     #4 [ffffffc013e5b3a0] die_kernel_fault at ffffffc0102808a4
>>>>     #5 [ffffffc013e5b3d0] __do_kernel_fault at ffffffc010280820
>>>>     #6 [ffffffc013e5b410] do_bad_area at ffffffc01028059c
>>>>     #7 [ffffffc013e5b440] do_translation_fault$4df5decbea5d08a63349aa36f07426b2 at ffffffc0111149c8
>>>>     #8 [ffffffc013e5b470] do_mem_abort at ffffffc0100a4488
>>>>     #9 [ffffffc013e5b5e0] el1_ia at ffffffc0100a6c00
>>>>     #10 [ffffffc013e5b5f0] __dump_page at ffffffc0104beecc
>>> 
>>> This doesn't show a crash in dump_mapping(), it shows a crash in
>>> __dump_page().
>> 
>> um, yes.
>> 
>> But if page->mapping is corrupted, where does __dump_page() dereference it?
> 
> I don't see anywhere that it does, so I'm suspicious that we have the
> correct diagnosis here.

I agree; since dump_mapping() is an actual function rather than a macro or
inline, if a bad dereference were happening within dump_mapping() I would think
we SHOULD see the call to dump_mapping() on the stack unless I'm missing
something obvious here.

Instead I'd like to know which instruction the faulting address in __dump_page()
maps to for the kernel experiencing this.

>> The initial patch
>> (https://lkml.kernel.org/r/1680587425-4683-1-git-send-email-Xiaosong.Ma@unisoc.com)
>> prevented __dump_page() from calling dump_mapping() if page->mapping is
>> bad, and that presumably fixed things.
> 
> Right, but doesn't the _existing_ get_kernel_nofault(host, &mapping->host)
> already prevent us from blindly dereferencing a bad mapping pointer?

I would think it would, but given the traceback, is the fault occurring
within dump_mapping(), or have we perhaps completed dump_mapping() and some
subtle corruption occurred such that the fault occurs on the return to
__dump_page()?

Certainly dump_mapping() looks to do the right thing to avoid using a bad
passed "mapping" as it's not dereferenced anywhere without checks, just
used for pointer math to create an address for calls to get_kernel_nofault().
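
A small sketch of what "pointer math only" means here, under stated assumptions: struct fake_mapping is a hypothetical layout (not the real struct address_space), and field_addr() is an illustrative helper, not a kernel function. It computes the address that &mapping->field would yield as base plus a compile-time offset, without reading any memory at the base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real struct address_space. */
struct fake_mapping {
	void *host;
	unsigned long flags;
	void *a_ops;
};

/*
 * What &mapping->field boils down to: the base address plus a constant
 * offset. Nothing at 'mapping' is read in order to compute it.
 */
static uint64_t field_addr(uint64_t mapping, size_t offset)
{
	return mapping + (uint64_t)offset;
}

/* Apply it to the wild value quoted in the report. */
static void demo_field_addr(void)
{
	const uint64_t wild = 0x80000000000ULL;

	/* host is the first member, so its address equals the base. */
	assert(field_addr(wild, offsetof(struct fake_mapping, host)) == wild);
	/* a_ops sits at a fixed positive offset from the base. */
	assert(field_addr(wild, offsetof(struct fake_mapping, a_ops)) > wild);
}
```

The only accesses to those computed addresses in dump_mapping() go through get_kernel_nofault(), which is why the wild value alone should not fault there.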

>> So confusion reigns.  I think making dump_mapping() tolerant of a wild
>> mapping pointer makes sense, but I don't think we actually know why the
>> reporter's kernel crashed.
> 
> In my mind dump_mapping() is already tolerant of a wild page->mapping
> pointer.  I think the problem is something entirely different.

Again, I agree.

As posited above, could it be that something occurs within dump_mapping()
such that when the code returns to __dump_page() it is at THAT point that
the fault occurs? That would explain the backtrace and why it shows the
fault as occurring within __dump_page(), but upon first glance the
mechanism by which this could be occurring eludes me.

The original patch doesn't mention whether any pr_warn() messages were
printed as a result of the call to dump_mapping(), and the suggested fix
would fix the issue whether the fault were occurring within dump_mapping() or
in the return from calling dump_mapping().

    -- Bill

Patch

diff --git a/fs/inode.c b/fs/inode.c
index f453eb5..c9021e5 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -564,7 +564,8 @@  void dump_mapping(const struct address_space *mapping)
 	 * If mapping is an invalid pointer, we don't want to crash
 	 * accessing it, so probe everything depending on it carefully.
 	 */
-	if (get_kernel_nofault(host, &mapping->host) ||
+	if (get_kernel_nofault(mapping, &mapping) ||
+	    get_kernel_nofault(host, &mapping->host) ||
 	    get_kernel_nofault(a_ops, &mapping->a_ops)) {
 		pr_warn("invalid mapping:%px\n", mapping);
 		return;