diff mbox series

[1/1] pstore/ram: Check member of buffers during the initialization phase of the pstore

Message ID 20230801060432.1307717-1-yunlong.xing@unisoc.com (mailing list archive)
State Mainlined
Commit fe8c3623ab06603eb760444a032d426542212021
Headers show
Series [1/1] pstore/ram: Check member of buffers during the initialization phase of the pstore | expand

Commit Message

Yunlong Xing Aug. 1, 2023, 6:04 a.m. UTC
From: Enlin Mu <enlin.mu@unisoc.com>

The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid")
would introduce the following issue:

When finding the buffer_size is zero, it would return directly.However, at
the same time, if the buffer's start is a illegal value, the others would
panic if access the buffer.

To avoid these happenning, check if the members are legal during the
initialization phase of the pstore.

Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid")
Cc: stable@vger.kernel.org
Signed-off-by: Enlin Mu <enlin.mu@unisoc.com>
---
 fs/pstore/ram_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Guilherme G. Piccoli Aug. 2, 2023, 3:46 p.m. UTC | #1
On 01/08/2023 03:04, Yunlong Xing wrote:
> [...]

> diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
> index 85aaf0fc6d7d..eb6df190d752 100644
> --- a/fs/pstore/ram_core.c
> +++ b/fs/pstore/ram_core.c
> @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
>  	sig ^= PERSISTENT_RAM_SIG;
>  
>  	if (prz->buffer->sig == sig) {
> -		if (buffer_size(prz) == 0) {
> +		if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
>  			pr_debug("found existing empty buffer\n");

Thanks for the patch! I'd also adjust the above print statement to
reflect the different paths (empty buffers vs illegal one) and maybe
bump it to pr_info, or even pr_warn(_once?).

What do you all think, makes sense or could we pollute dmesg too much?
Cheers!
yunlong xing Aug. 3, 2023, 10:43 a.m. UTC | #2
On Wed, Aug 2, 2023 at 11:47 PM Guilherme G. Piccoli
<gpiccoli@igalia.com> wrote:
>
> On 01/08/2023 03:04, Yunlong Xing wrote:
> > [...]
>
> > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
> > index 85aaf0fc6d7d..eb6df190d752 100644
> > --- a/fs/pstore/ram_core.c
> > +++ b/fs/pstore/ram_core.c
> > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
> >       sig ^= PERSISTENT_RAM_SIG;
> >
> >       if (prz->buffer->sig == sig) {
> > -             if (buffer_size(prz) == 0) {
> > +             if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
> >                       pr_debug("found existing empty buffer\n");
>
> Thanks for the patch! I'd also adjust the above print statement to
> reflect the different paths (empty buffers vs illegal one) and maybe
> bump it to pr_info, or even pr_warn(_once?).

I don't think it's necessary.
Through testing, the probability of this situation occurring is still very low.
Using 20 devices, reboot every 10 seconds, and only once after 96 hours.
The panic stack information is as follows:

case A:
 sysdump_panic_event+0x3b4/0x5b8
 atomic_notifier_call_chain+0x54/0x90
 panic+0x1c8/0x42c
 die+0x29c/0x2a8
 die_kernel_fault+0x68/0x78
 __do_kernel_fault+0x1c4/0x1e0
 do_bad_area+0x40/0x100
 do_translation_fault+0x68/0x80
 do_mem_abort+0x68/0xf8
 el1_da+0x1c/0xc0
 __raw_writeb+0x38/0x174
 __memcpy_toio+0x40/0xac
 persistent_ram_update+0x44/0x12c
 persistent_ram_write+0x1a8/0x1b8
 ramoops_pstore_write+0x198/0x1e8
 pstore_console_write+0x94/0xe0
 console_unlock+0x3a4/0x5d8
 register_console+0x3ac/0x488
 pstore_register+0x1f8/0x204
 ramoops_probe+0x460/0x508
 platform_drv_probe+0xb0/0xf4
 really_probe+0x1c8/0x7d0
 driver_probe_device+0xc0/0x140
 __device_attach_driver+0x10c/0x200
 bus_for_each_drv+0xa8/0x11c
 __device_attach.llvm.1835537715725466631+0xf0/0x19c
 bus_probe_device+0x40/0xe0
 device_add+0x964/0xb8c
 of_device_add+0x40/0x54
 of_platform_device_create_pdata+0xb8/0x140
 of_platform_default_populate_init+0x5c/0xd4

case B
 sysdump_panic_event+0x720/0xd38
 atomic_notifier_call_chain+0x58/0xc0
 panic+0x1c4/0x6e4
 die+0x3c0/0x428
 bug_handler+0x4c/0x9c
 brk_handler+0x98/0x14c
 do_debug_exception+0x114/0x2ec
 el1_dbg+0x18/0xbc
 usercopy_abort+0x90/0x94
 __check_object_size+0x17c/0x2c4
 persistent_ram_update_user+0x50/0x220
 persistent_ram_write_user+0x354/0x428
 ramoops_pstore_write_user+0x34/0x50
 write_pmsg+0x14c/0x26c
 do_iter_write+0x1cc/0x2cc
 vfs_writev+0xf4/0x168
 do_writev+0xa4/0x200
 __arm64_sys_writev+0x20/0x2c
 el0_svc_common+0xc8/0x22c
 el0_svc_handler+0x1c/0x28
 el0_svc+0x8/0x100
>
> What do you all think, makes sense or could we pollute dmesg too much?
> Cheers!
Kees Cook Aug. 4, 2023, 8:10 a.m. UTC | #3
On Tue, Aug 01, 2023 at 02:04:32PM +0800, Yunlong Xing wrote:
> From: Enlin Mu <enlin.mu@unisoc.com>
> 
> The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid")
> would introduce the following issue:
> 
> When finding the buffer_size is zero, it would return directly.However, at
> the same time, if the buffer's start is a illegal value, the others would
> panic if access the buffer.

Which "others" do you mean?

> To avoid these happenning, check if the members are legal during the
> initialization phase of the pstore.
> 
> Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid")
> Cc: stable@vger.kernel.org
> Signed-off-by: Enlin Mu <enlin.mu@unisoc.com>
> ---
>  fs/pstore/ram_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
> index 85aaf0fc6d7d..eb6df190d752 100644
> --- a/fs/pstore/ram_core.c
> +++ b/fs/pstore/ram_core.c
> @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
>  	sig ^= PERSISTENT_RAM_SIG;
>  
>  	if (prz->buffer->sig == sig) {
> -		if (buffer_size(prz) == 0) {
> +		if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
>  			pr_debug("found existing empty buffer\n");
>  			return 0;
>  		}

And in the case of "buffer_size(prz) == 0" but "buffer_start(prz) != 0",
this will be caught by:

                if (buffer_size(prz) > prz->buffer_size ||
                    buffer_start(prz) > buffer_size(prz)) {
                        pr_info("found existing invalid buffer, size %zu, start %zu\n",
                                buffer_size(prz), buffer_start(prz));
                        zap = true;
                }

i.e. it will be detected and zapped back to a sane state.

That sounds correct to me, though I wonder if reporting it as an
"invalid buffer" is inaccurate? Perhaps we should have a separate case:

		if (buffer_size(prz) == 0) {
			if (buffer_start(prz) == 0)
				pr_debug("found existing empty buffer\n");
			else {
				pr_debug("found existing empty buffer with non-zero start\n");
				zap = true;
			}
		} else if ...

What do you think?
yunlong xing Aug. 4, 2023, 8:59 a.m. UTC | #4
On Fri, Aug 4, 2023 at 4:10 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, Aug 01, 2023 at 02:04:32PM +0800, Yunlong Xing wrote:
> > From: Enlin Mu <enlin.mu@unisoc.com>
> >
> > The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid")
> > would introduce the following issue:
> >
> > When finding the buffer_size is zero, it would return directly.However, at
> > the same time, if the buffer's start is a illegal value, the others would
> > panic if access the buffer.
>
> Which "others" do you mean?

About “others", You can refer to the following panic call stack:
 sysdump_panic_event+0x720/0xd38
 atomic_notifier_call_chain+0x58/0xc0
 panic+0x1c4/0x6e4
 die+0x3c0/0x428
 bug_handler+0x4c/0x9c
 brk_handler+0x98/0x14c
 do_debug_exception+0x114/0x2ec
 el1_dbg+0x18/0xbc
 usercopy_abort+0x90/0x94
 __check_object_size+0x17c/0x2c4
 persistent_ram_update_user+0x50/0x220
 persistent_ram_write_user+0x354/0x428
 ramoops_pstore_write_user+0x34/0x50
 write_pmsg+0x14c/0x26c
 do_iter_write+0x1cc/0x2cc
 vfs_writev+0xf4/0x168
 do_writev+0xa4/0x200
 __arm64_sys_writev+0x20/0x2c
 el0_svc_common+0xc8/0x22c
 el0_svc_handler+0x1c/0x28
 el0_svc+0x8/0x100
>
> > To avoid these happenning, check if the members are legal during the
> > initialization phase of the pstore.
> >
> > Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Enlin Mu <enlin.mu@unisoc.com>
> > ---
> >  fs/pstore/ram_core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
> > index 85aaf0fc6d7d..eb6df190d752 100644
> > --- a/fs/pstore/ram_core.c
> > +++ b/fs/pstore/ram_core.c
> > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
> >       sig ^= PERSISTENT_RAM_SIG;
> >
> >       if (prz->buffer->sig == sig) {
> > -             if (buffer_size(prz) == 0) {
> > +             if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
> >                       pr_debug("found existing empty buffer\n");
> >                       return 0;
> >               }
>
> And in the case of "buffer_size(prz) == 0" but "buffer_start(prz) != 0",
> this will be caught by:
>
>                 if (buffer_size(prz) > prz->buffer_size ||
>                     buffer_start(prz) > buffer_size(prz)) {
>                         pr_info("found existing invalid buffer, size %zu, start %zu\n",
>                                 buffer_size(prz), buffer_start(prz));
>                         zap = true;
>                 }
>
> i.e. it will be detected and zapped back to a sane state.
No,This code has no chance of execution because there was a return 0 before it
>
> That sounds correct to me, though I wonder if reporting it as an
> "invalid buffer" is inaccurate? Perhaps we should have a separate case:
>
>                 if (buffer_size(prz) == 0) {
>                         if (buffer_start(prz) == 0)
>                                 pr_debug("found existing empty buffer\n");
>                         else {
>                                 pr_debug("found existing empty buffer with non-zero start\n");
>                                 zap = true;
>                         }
>                 } else if ...
>
> What do you think?
Good, I gree it. For me, it should not return directly while finding
the buffer_size is zero, We need Check others case.
So does the modification method you mentioned require me to resubmit a
patch or do you need to modify and merge it
>
> --
> Kees Cook
Kees Cook Aug. 4, 2023, 4:53 p.m. UTC | #5
On Fri, Aug 04, 2023 at 04:59:07PM +0800, yunlong xing wrote:
> On Fri, Aug 4, 2023 at 4:10 PM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Tue, Aug 01, 2023 at 02:04:32PM +0800, Yunlong Xing wrote:
> > > From: Enlin Mu <enlin.mu@unisoc.com>
> > >
> > > The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid")
> > > would introduce the following issue:
> > >
> > > When finding the buffer_size is zero, it would return directly.However, at
> > > the same time, if the buffer's start is a illegal value, the others would
> > > panic if access the buffer.
> >
> > Which "others" do you mean?
> 
> About “others", You can refer to the following panic call stack:
>  sysdump_panic_event+0x720/0xd38
>  atomic_notifier_call_chain+0x58/0xc0
>  panic+0x1c4/0x6e4
>  die+0x3c0/0x428
>  bug_handler+0x4c/0x9c
>  brk_handler+0x98/0x14c
>  do_debug_exception+0x114/0x2ec
>  el1_dbg+0x18/0xbc
>  usercopy_abort+0x90/0x94
>  __check_object_size+0x17c/0x2c4
>  persistent_ram_update_user+0x50/0x220
>  persistent_ram_write_user+0x354/0x428
>  ramoops_pstore_write_user+0x34/0x50
>  write_pmsg+0x14c/0x26c

I see -- the "start" is corrupted and out of bounds, which leads to
these accesses.

>  do_iter_write+0x1cc/0x2cc
>  vfs_writev+0xf4/0x168
>  do_writev+0xa4/0x200
>  __arm64_sys_writev+0x20/0x2c
>  el0_svc_common+0xc8/0x22c
>  el0_svc_handler+0x1c/0x28
>  el0_svc+0x8/0x100
> >
> > > To avoid these happenning, check if the members are legal during the
> > > initialization phase of the pstore.
> > >
> > > Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Enlin Mu <enlin.mu@unisoc.com>
> > > ---
> > >  fs/pstore/ram_core.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
> > > index 85aaf0fc6d7d..eb6df190d752 100644
> > > --- a/fs/pstore/ram_core.c
> > > +++ b/fs/pstore/ram_core.c
> > > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
> > >       sig ^= PERSISTENT_RAM_SIG;
> > >
> > >       if (prz->buffer->sig == sig) {
> > > -             if (buffer_size(prz) == 0) {
> > > +             if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
> > >                       pr_debug("found existing empty buffer\n");
> > >                       return 0;
> > >               }
> >
> > And in the case of "buffer_size(prz) == 0" but "buffer_start(prz) != 0",
> > this will be caught by:
> >
> >                 if (buffer_size(prz) > prz->buffer_size ||
> >                     buffer_start(prz) > buffer_size(prz)) {
> >                         pr_info("found existing invalid buffer, size %zu, start %zu\n",
> >                                 buffer_size(prz), buffer_start(prz));
> >                         zap = true;
> >                 }
> >
> > i.e. it will be detected and zapped back to a sane state.
> No,This code has no chance of execution because there was a return 0 before it

Right, I meant the behavior with your patch -- with your patch the case
of "size == 0 && start != 0" would be caught by the above check ("start > size")
and zapped back to sanity. (Which is the correct result.)

> >
> > That sounds correct to me, though I wonder if reporting it as an
> > "invalid buffer" is inaccurate? Perhaps we should have a separate case:
> >
> >                 if (buffer_size(prz) == 0) {
> >                         if (buffer_start(prz) == 0)
> >                                 pr_debug("found existing empty buffer\n");
> >                         else {
> >                                 pr_debug("found existing empty buffer with non-zero start\n");
> >                                 zap = true;
> >                         }
> >                 } else if ...
> >
> > What do you think?
> Good, I gree it. For me, it should not return directly while finding
> the buffer_size is zero, We need Check others case.

Right. The only question I have is: how did the "start" get corrupted,
and is that a notable condition? Right now we don't (info-level) log
a size==0 prz since that's an expected state for a regular initialized
prz. So maybe your patch is correct as-is since we'd want to report the
"found existing invalid buffer" case.

> So does the modification method you mentioned require me to resubmit a
> patch or do you need to modify and merge it

I think I'll update the commit log and take this as-is. If the logging
becomes too noisy, we can adjust the case later.

Thanks!
Kees Cook Aug. 4, 2023, 5:04 p.m. UTC | #6
On Tue, 01 Aug 2023 14:04:32 +0800, Yunlong Xing wrote:
> The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid")
> would introduce the following issue:
> 
> When finding the buffer_size is zero, it would return directly.However, at
> the same time, if the buffer's start is a illegal value, the others would
> panic if access the buffer.
> 
> [...]

Applied to for-next/pstore, thanks!

[1/1] pstore/ram: Check member of buffers during the initialization phase of the pstore
      https://git.kernel.org/kees/c/fe8c3623ab06

Take care,
yunlong xing Aug. 7, 2023, 1:33 a.m. UTC | #7
On Sat, Aug 5, 2023 at 12:53 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Fri, Aug 04, 2023 at 04:59:07PM +0800, yunlong xing wrote:
> > On Fri, Aug 4, 2023 at 4:10 PM Kees Cook <keescook@chromium.org> wrote:
> > >
> > > On Tue, Aug 01, 2023 at 02:04:32PM +0800, Yunlong Xing wrote:
> > > > From: Enlin Mu <enlin.mu@unisoc.com>
> > > >
> > > > The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid")
> > > > would introduce the following issue:
> > > >
> > > > When finding the buffer_size is zero, it would return directly.However, at
> > > > the same time, if the buffer's start is a illegal value, the others would
> > > > panic if access the buffer.
> > >
> > > Which "others" do you mean?
> >
> > About “others", You can refer to the following panic call stack:
> >  sysdump_panic_event+0x720/0xd38
> >  atomic_notifier_call_chain+0x58/0xc0
> >  panic+0x1c4/0x6e4
> >  die+0x3c0/0x428
> >  bug_handler+0x4c/0x9c
> >  brk_handler+0x98/0x14c
> >  do_debug_exception+0x114/0x2ec
> >  el1_dbg+0x18/0xbc
> >  usercopy_abort+0x90/0x94
> >  __check_object_size+0x17c/0x2c4
> >  persistent_ram_update_user+0x50/0x220
> >  persistent_ram_write_user+0x354/0x428
> >  ramoops_pstore_write_user+0x34/0x50
> >  write_pmsg+0x14c/0x26c
>
> I see -- the "start" is corrupted and out of bounds, which leads to
> these accesses.
>
> >  do_iter_write+0x1cc/0x2cc
> >  vfs_writev+0xf4/0x168
> >  do_writev+0xa4/0x200
> >  __arm64_sys_writev+0x20/0x2c
> >  el0_svc_common+0xc8/0x22c
> >  el0_svc_handler+0x1c/0x28
> >  el0_svc+0x8/0x100
> > >
> > > > To avoid these happenning, check if the members are legal during the
> > > > initialization phase of the pstore.
> > > >
> > > > Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid")
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Enlin Mu <enlin.mu@unisoc.com>
> > > > ---
> > > >  fs/pstore/ram_core.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
> > > > index 85aaf0fc6d7d..eb6df190d752 100644
> > > > --- a/fs/pstore/ram_core.c
> > > > +++ b/fs/pstore/ram_core.c
> > > > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
> > > >       sig ^= PERSISTENT_RAM_SIG;
> > > >
> > > >       if (prz->buffer->sig == sig) {
> > > > -             if (buffer_size(prz) == 0) {
> > > > +             if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
> > > >                       pr_debug("found existing empty buffer\n");
> > > >                       return 0;
> > > >               }
> > >
> > > And in the case of "buffer_size(prz) == 0" but "buffer_start(prz) != 0",
> > > this will be caught by:
> > >
> > >                 if (buffer_size(prz) > prz->buffer_size ||
> > >                     buffer_start(prz) > buffer_size(prz)) {
> > >                         pr_info("found existing invalid buffer, size %zu, start %zu\n",
> > >                                 buffer_size(prz), buffer_start(prz));
> > >                         zap = true;
> > >                 }
> > >
> > > i.e. it will be detected and zapped back to a sane state.
> > No,This code has no chance of execution because there was a return 0 before it
>
> Right, I meant the behavior with your patch -- with your patch the case
> of "size == 0 && start != 0" would be caught by the above check ("start > size")
> and zapped back to sanity. (Which is the correct result.)
>
> > >
> > > That sounds correct to me, though I wonder if reporting it as an
> > > "invalid buffer" is inaccurate? Perhaps we should have a separate case:
> > >
> > >                 if (buffer_size(prz) == 0) {
> > >                         if (buffer_start(prz) == 0)
> > >                                 pr_debug("found existing empty buffer\n");
> > >                         else {
> > >                                 pr_debug("found existing empty buffer with non-zero start\n");
> > >                                 zap = true;
> > >                         }
> > >                 } else if ...
> > >
> > > What do you think?
> > Good, I gree it. For me, it should not return directly while finding
> > the buffer_size is zero, We need Check others case.
>
> Right. The only question I have is: how did the "start" get corrupted,
> and is that a notable condition? Right now we don't (info-level) log
> a size==0 prz since that's an expected state for a regular initialized
> prz. So maybe your patch is correct as-is since we'd want to report the
> "found existing invalid buffer" case.

From the last reboot to this initialization, the ddr was not stable
enough, resulting in
a jump in the start value.I hope to add error handling mechanisms to
avoid abnormal
data being used.
Thanks!
>
> > So does the modification method you mentioned require me to resubmit a
> > patch or do you need to modify and merge it
>
> I think I'll update the commit log and take this as-is. If the logging
> becomes too noisy, we can adjust the case later.
>
> Thanks!
>
> --
> Kees Cook
diff mbox series

Patch

diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
index 85aaf0fc6d7d..eb6df190d752 100644
--- a/fs/pstore/ram_core.c
+++ b/fs/pstore/ram_core.c
@@ -519,7 +519,7 @@  static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
 	sig ^= PERSISTENT_RAM_SIG;
 
 	if (prz->buffer->sig == sig) {
-		if (buffer_size(prz) == 0) {
+		if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
 			pr_debug("found existing empty buffer\n");
 			return 0;
 		}