Message ID | 20230801060432.1307717-1-yunlong.xing@unisoc.com (mailing list archive) |
---|---|
State | Mainlined |
Commit | fe8c3623ab06603eb760444a032d426542212021 |
Headers | show |
Series | [1/1] pstore/ram: Check member of buffers during the initialization phase of the pstore | expand |
On 01/08/2023 03:04, Yunlong Xing wrote: > [...] > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c > index 85aaf0fc6d7d..eb6df190d752 100644 > --- a/fs/pstore/ram_core.c > +++ b/fs/pstore/ram_core.c > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig, > sig ^= PERSISTENT_RAM_SIG; > > if (prz->buffer->sig == sig) { > - if (buffer_size(prz) == 0) { > + if (buffer_size(prz) == 0 && buffer_start(prz) == 0) { > pr_debug("found existing empty buffer\n"); Thanks for the patch! I'd also adjust the above print statement to reflect the different paths (empty buffers vs illegal one) and maybe bump it to pr_info, or even pr_warn(_once?). What do you all think, makes sense or could we pollute dmesg too much? Cheers!
On Wed, Aug 2, 2023 at 11:47 PM Guilherme G. Piccoli <gpiccoli@igalia.com> wrote: > > On 01/08/2023 03:04, Yunlong Xing wrote: > > [...] > > > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c > > index 85aaf0fc6d7d..eb6df190d752 100644 > > --- a/fs/pstore/ram_core.c > > +++ b/fs/pstore/ram_core.c > > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig, > > sig ^= PERSISTENT_RAM_SIG; > > > > if (prz->buffer->sig == sig) { > > - if (buffer_size(prz) == 0) { > > + if (buffer_size(prz) == 0 && buffer_start(prz) == 0) { > > pr_debug("found existing empty buffer\n"); > > Thanks for the patch! I'd also adjust the above print statement to > reflect the different paths (empty buffers vs illegal one) and maybe > bump it to pr_info, or even pr_warn(_once?). I don't think it's necessary. Through testing, the probability of this situation occurring is still very low. Using 20 devices, reboot every 10 seconds, and only once after 96 hours. The panic stack information is as follows: case A: sysdump_panic_event+0x3b4/0x5b8 atomic_notifier_call_chain+0x54/0x90 panic+0x1c8/0x42c die+0x29c/0x2a8 die_kernel_fault+0x68/0x78 __do_kernel_fault+0x1c4/0x1e0 do_bad_area+0x40/0x100 do_translation_fault+0x68/0x80 do_mem_abort+0x68/0xf8 el1_da+0x1c/0xc0 __raw_writeb+0x38/0x174 __memcpy_toio+0x40/0xac persistent_ram_update+0x44/0x12c persistent_ram_write+0x1a8/0x1b8 ramoops_pstore_write+0x198/0x1e8 pstore_console_write+0x94/0xe0 console_unlock+0x3a4/0x5d8 register_console+0x3ac/0x488 pstore_register+0x1f8/0x204 ramoops_probe+0x460/0x508 platform_drv_probe+0xb0/0xf4 really_probe+0x1c8/0x7d0 driver_probe_device+0xc0/0x140 __device_attach_driver+0x10c/0x200 bus_for_each_drv+0xa8/0x11c __device_attach.llvm.1835537715725466631+0xf0/0x19c bus_probe_device+0x40/0xe0 device_add+0x964/0xb8c of_device_add+0x40/0x54 of_platform_device_create_pdata+0xb8/0x140 of_platform_default_populate_init+0x5c/0xd4 case B sysdump_panic_event+0x720/0xd38 atomic_notifier_call_chain+0x58/0xc0 panic+0x1c4/0x6e4 die+0x3c0/0x428 bug_handler+0x4c/0x9c brk_handler+0x98/0x14c do_debug_exception+0x114/0x2ec el1_dbg+0x18/0xbc usercopy_abort+0x90/0x94 __check_object_size+0x17c/0x2c4 persistent_ram_update_user+0x50/0x220 persistent_ram_write_user+0x354/0x428 ramoops_pstore_write_user+0x34/0x50 write_pmsg+0x14c/0x26c do_iter_write+0x1cc/0x2cc vfs_writev+0xf4/0x168 do_writev+0xa4/0x200 __arm64_sys_writev+0x20/0x2c el0_svc_common+0xc8/0x22c el0_svc_handler+0x1c/0x28 el0_svc+0x8/0x100 > > What do you all think, makes sense or could we pollute dmesg too much? > Cheers!
On Tue, Aug 01, 2023 at 02:04:32PM +0800, Yunlong Xing wrote: > From: Enlin Mu <enlin.mu@unisoc.com> > > The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid") > would introduce the following issue: > > When finding the buffer_size is zero, it would return directly.However, at > the same time, if the buffer's start is a illegal value, the others would > panic if access the buffer. Which "others" do you mean? > To avoid these happenning, check if the members are legal during the > initialization phase of the pstore. > > Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid") > Cc: stable@vger.kernel.org > Signed-off-by: Enlin Mu <enlin.mu@unisoc.com> > --- > fs/pstore/ram_core.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c > index 85aaf0fc6d7d..eb6df190d752 100644 > --- a/fs/pstore/ram_core.c > +++ b/fs/pstore/ram_core.c > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig, > sig ^= PERSISTENT_RAM_SIG; > > if (prz->buffer->sig == sig) { > - if (buffer_size(prz) == 0) { > + if (buffer_size(prz) == 0 && buffer_start(prz) == 0) { > pr_debug("found existing empty buffer\n"); > return 0; > } And in the case of "buffer_size(prz) == 0" but "buffer_start(prz) != 0", this will be caught by: if (buffer_size(prz) > prz->buffer_size || buffer_start(prz) > buffer_size(prz)) { pr_info("found existing invalid buffer, size %zu, start %zu\n", buffer_size(prz), buffer_start(prz)); zap = true; } i.e. it will be detected and zapped back to a sane state. That sounds correct to me, though I wonder if reporting it as an "invalid buffer" is inaccurate? Perhaps we should have a separate case: if (buffer_size(prz) == 0) { if (buffer_start(prz) == 0) pr_debug("found existing empty buffer\n"); else { pr_debug("found existing empty buffer with non-zero start\n"); zap = true; } } else if ... What do you think?
On Fri, Aug 4, 2023 at 4:10 PM Kees Cook <keescook@chromium.org> wrote: > > On Tue, Aug 01, 2023 at 02:04:32PM +0800, Yunlong Xing wrote: > > From: Enlin Mu <enlin.mu@unisoc.com> > > > > The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid") > > would introduce the following issue: > > > > When finding the buffer_size is zero, it would return directly.However, at > > the same time, if the buffer's start is a illegal value, the others would > > panic if access the buffer. > > Which "others" do you mean? About “others", You can refer to the following panic call stack: sysdump_panic_event+0x720/0xd38 atomic_notifier_call_chain+0x58/0xc0 panic+0x1c4/0x6e4 die+0x3c0/0x428 bug_handler+0x4c/0x9c brk_handler+0x98/0x14c do_debug_exception+0x114/0x2ec el1_dbg+0x18/0xbc usercopy_abort+0x90/0x94 __check_object_size+0x17c/0x2c4 persistent_ram_update_user+0x50/0x220 persistent_ram_write_user+0x354/0x428 ramoops_pstore_write_user+0x34/0x50 write_pmsg+0x14c/0x26c do_iter_write+0x1cc/0x2cc vfs_writev+0xf4/0x168 do_writev+0xa4/0x200 __arm64_sys_writev+0x20/0x2c el0_svc_common+0xc8/0x22c el0_svc_handler+0x1c/0x28 el0_svc+0x8/0x100 > > > To avoid these happenning, check if the members are legal during the > > initialization phase of the pstore. > > > > Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid") > > Cc: stable@vger.kernel.org > > Signed-off-by: Enlin Mu <enlin.mu@unisoc.com> > > --- > > fs/pstore/ram_core.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c > > index 85aaf0fc6d7d..eb6df190d752 100644 > > --- a/fs/pstore/ram_core.c > > +++ b/fs/pstore/ram_core.c > > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig, > > sig ^= PERSISTENT_RAM_SIG; > > > > if (prz->buffer->sig == sig) { > > - if (buffer_size(prz) == 0) { > > + if (buffer_size(prz) == 0 && buffer_start(prz) == 0) { > > pr_debug("found existing empty buffer\n"); > > return 0; > > } > > And in the case of "buffer_size(prz) == 0" but "buffer_start(prz) != 0", > this will be caught by: > > if (buffer_size(prz) > prz->buffer_size || > buffer_start(prz) > buffer_size(prz)) { > pr_info("found existing invalid buffer, size %zu, start %zu\n", > buffer_size(prz), buffer_start(prz)); > zap = true; > } > > i.e. it will be detected and zapped back to a sane state. No,This code has no chance of execution because there was a return 0 before it > > That sounds correct to me, though I wonder if reporting it as an > "invalid buffer" is inaccurate? Perhaps we should have a separate case: > > if (buffer_size(prz) == 0) { > if (buffer_start(prz) == 0) > pr_debug("found existing empty buffer\n"); > else { > pr_debug("found existing empty buffer with non-zero start\n"); > zap = true; > } > } else if ... > > What do you think? Good, I gree it. For me, it should not return directly while finding the buffer_size is zero, We need Check others case. So does the modification method you mentioned require me to resubmit a patch or do you need to modify and merge it > > -- > Kees Cook
On Fri, Aug 04, 2023 at 04:59:07PM +0800, yunlong xing wrote: > On Fri, Aug 4, 2023 at 4:10 PM Kees Cook <keescook@chromium.org> wrote: > > > > On Tue, Aug 01, 2023 at 02:04:32PM +0800, Yunlong Xing wrote: > > > From: Enlin Mu <enlin.mu@unisoc.com> > > > > > > The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid") > > > would introduce the following issue: > > > > > > When finding the buffer_size is zero, it would return directly.However, at > > > the same time, if the buffer's start is a illegal value, the others would > > > panic if access the buffer. > > > > Which "others" do you mean? > > About “others", You can refer to the following panic call stack: > sysdump_panic_event+0x720/0xd38 > atomic_notifier_call_chain+0x58/0xc0 > panic+0x1c4/0x6e4 > die+0x3c0/0x428 > bug_handler+0x4c/0x9c > brk_handler+0x98/0x14c > do_debug_exception+0x114/0x2ec > el1_dbg+0x18/0xbc > usercopy_abort+0x90/0x94 > __check_object_size+0x17c/0x2c4 > persistent_ram_update_user+0x50/0x220 > persistent_ram_write_user+0x354/0x428 > ramoops_pstore_write_user+0x34/0x50 > write_pmsg+0x14c/0x26c I see -- the "start" is corrupted and out of bounds, which leads to these accesses. > do_iter_write+0x1cc/0x2cc > vfs_writev+0xf4/0x168 > do_writev+0xa4/0x200 > __arm64_sys_writev+0x20/0x2c > el0_svc_common+0xc8/0x22c > el0_svc_handler+0x1c/0x28 > el0_svc+0x8/0x100 > > > > > To avoid these happenning, check if the members are legal during the > > > initialization phase of the pstore. > > > > > > Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid") > > > Cc: stable@vger.kernel.org > > > Signed-off-by: Enlin Mu <enlin.mu@unisoc.com> > > > --- > > > fs/pstore/ram_core.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c > > > index 85aaf0fc6d7d..eb6df190d752 100644 > > > --- a/fs/pstore/ram_core.c > > > +++ b/fs/pstore/ram_core.c > > > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig, > > > sig ^= PERSISTENT_RAM_SIG; > > > > > > if (prz->buffer->sig == sig) { > > > - if (buffer_size(prz) == 0) { > > > + if (buffer_size(prz) == 0 && buffer_start(prz) == 0) { > > > pr_debug("found existing empty buffer\n"); > > > return 0; > > > } > > > > And in the case of "buffer_size(prz) == 0" but "buffer_start(prz) != 0", > > this will be caught by: > > > > if (buffer_size(prz) > prz->buffer_size || > > buffer_start(prz) > buffer_size(prz)) { > > pr_info("found existing invalid buffer, size %zu, start %zu\n", > > buffer_size(prz), buffer_start(prz)); > > zap = true; > > } > > > > i.e. it will be detected and zapped back to a sane state. > No,This code has no chance of execution because there was a return 0 before it Right, I meant the behavior with your patch -- with your patch the case of "size == 0 && start != 0" would be caught by the above check ("start > size") and zapped back to sanity. (Which is the correct result.) > > > > That sounds correct to me, though I wonder if reporting it as an > > "invalid buffer" is inaccurate? Perhaps we should have a separate case: > > > > if (buffer_size(prz) == 0) { > > if (buffer_start(prz) == 0) > > pr_debug("found existing empty buffer\n"); > > else { > > pr_debug("found existing empty buffer with non-zero start\n"); > > zap = true; > > } > > } else if ... > > > > What do you think? > Good, I gree it. For me, it should not return directly while finding > the buffer_size is zero, We need Check others case. Right. The only question I have is: how did the "start" get corrupted, and is that a notable condition? Right now we don't (info-level) log a size==0 prz since that's an expected state for a regular initialized prz. So maybe your patch is correct as-is since we'd want to report the "found existing invalid buffer" case. > So does the modification method you mentioned require me to resubmit a > patch or do you need to modify and merge it I think I'll update the commit log and take this as-is. If the logging becomes too noisy, we can adjust the case later. Thanks!
On Tue, 01 Aug 2023 14:04:32 +0800, Yunlong Xing wrote: > The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid") > would introduce the following issue: > > When finding the buffer_size is zero, it would return directly.However, at > the same time, if the buffer's start is a illegal value, the others would > panic if access the buffer. > > [...] Applied to for-next/pstore, thanks! [1/1] pstore/ram: Check member of buffers during the initialization phase of the pstore https://git.kernel.org/kees/c/fe8c3623ab06 Take care,
On Sat, Aug 5, 2023 at 12:53 AM Kees Cook <keescook@chromium.org> wrote: > > On Fri, Aug 04, 2023 at 04:59:07PM +0800, yunlong xing wrote: > > On Fri, Aug 4, 2023 at 4:10 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > On Tue, Aug 01, 2023 at 02:04:32PM +0800, Yunlong Xing wrote: > > > > From: Enlin Mu <enlin.mu@unisoc.com> > > > > > > > > The commit 30696378f68a("pstore/ram: Do not treat empty buffers as valid") > > > > would introduce the following issue: > > > > > > > > When finding the buffer_size is zero, it would return directly.However, at > > > > the same time, if the buffer's start is a illegal value, the others would > > > > panic if access the buffer. > > > > > > Which "others" do you mean? > > > > About “others", You can refer to the following panic call stack: > > sysdump_panic_event+0x720/0xd38 > > atomic_notifier_call_chain+0x58/0xc0 > > panic+0x1c4/0x6e4 > > die+0x3c0/0x428 > > bug_handler+0x4c/0x9c > > brk_handler+0x98/0x14c > > do_debug_exception+0x114/0x2ec > > el1_dbg+0x18/0xbc > > usercopy_abort+0x90/0x94 > > __check_object_size+0x17c/0x2c4 > > persistent_ram_update_user+0x50/0x220 > > persistent_ram_write_user+0x354/0x428 > > ramoops_pstore_write_user+0x34/0x50 > > write_pmsg+0x14c/0x26c > > I see -- the "start" is corrupted and out of bounds, which leads to > these accesses. > > > do_iter_write+0x1cc/0x2cc > > vfs_writev+0xf4/0x168 > > do_writev+0xa4/0x200 > > __arm64_sys_writev+0x20/0x2c > > el0_svc_common+0xc8/0x22c > > el0_svc_handler+0x1c/0x28 > > el0_svc+0x8/0x100 > > > > > > > To avoid these happenning, check if the members are legal during the > > > > initialization phase of the pstore. > > > > > > > > Fixes: 30696378f68a ("pstore/ram: Do not treat empty buffers as valid") > > > > Cc: stable@vger.kernel.org > > > > Signed-off-by: Enlin Mu <enlin.mu@unisoc.com> > > > > --- > > > > fs/pstore/ram_core.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c > > > > index 85aaf0fc6d7d..eb6df190d752 100644 > > > > --- a/fs/pstore/ram_core.c > > > > +++ b/fs/pstore/ram_core.c > > > > @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig, > > > > sig ^= PERSISTENT_RAM_SIG; > > > > > > > > if (prz->buffer->sig == sig) { > > > > - if (buffer_size(prz) == 0) { > > > > + if (buffer_size(prz) == 0 && buffer_start(prz) == 0) { > > > > pr_debug("found existing empty buffer\n"); > > > > return 0; > > > > } > > > > > > And in the case of "buffer_size(prz) == 0" but "buffer_start(prz) != 0", > > > this will be caught by: > > > > > > if (buffer_size(prz) > prz->buffer_size || > > > buffer_start(prz) > buffer_size(prz)) { > > > pr_info("found existing invalid buffer, size %zu, start %zu\n", > > > buffer_size(prz), buffer_start(prz)); > > > zap = true; > > > } > > > > > > i.e. it will be detected and zapped back to a sane state. > > No,This code has no chance of execution because there was a return 0 before it > > Right, I meant the behavior with your patch -- with your patch the case > of "size == 0 && start != 0" would be caught by the above check ("start > size") > and zapped back to sanity. (Which is the correct result.) > > > > > > > That sounds correct to me, though I wonder if reporting it as an > > > "invalid buffer" is inaccurate? Perhaps we should have a separate case: > > > > > > if (buffer_size(prz) == 0) { > > > if (buffer_start(prz) == 0) > > > pr_debug("found existing empty buffer\n"); > > > else { > > > pr_debug("found existing empty buffer with non-zero start\n"); > > > zap = true; > > > } > > > } else if ... > > > > > > What do you think? > > Good, I gree it. For me, it should not return directly while finding > > the buffer_size is zero, We need Check others case. > > Right. The only question I have is: how did the "start" get corrupted, > and is that a notable condition? Right now we don't (info-level) log > a size==0 prz since that's an expected state for a regular initialized > prz. So maybe your patch is correct as-is since we'd want to report the > "found existing invalid buffer" case. From the last reboot to this initialization, the ddr was not stable enough, resulting in a jump in the start value.I hope to add error handling mechanisms to avoid abnormal data being used. Thanks! > > > So does the modification method you mentioned require me to resubmit a > > patch or do you need to modify and merge it > > I think I'll update the commit log and take this as-is. If the logging > becomes too noisy, we can adjust the case later. > > Thanks! > > -- > Kees Cook
diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c index 85aaf0fc6d7d..eb6df190d752 100644 --- a/fs/pstore/ram_core.c +++ b/fs/pstore/ram_core.c @@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig, sig ^= PERSISTENT_RAM_SIG; if (prz->buffer->sig == sig) { - if (buffer_size(prz) == 0) { + if (buffer_size(prz) == 0 && buffer_start(prz) == 0) { pr_debug("found existing empty buffer\n"); return 0; }