Message ID | 20221202163728.392509-1-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mce: fix missing stack-dumping in mce_panic() | expand |
> When machine check exception occurs, there is no stack-dumping now in > mce_panic(). It's because bust_spinlocks(1) is called prematurely so > oops_in_progress will be >= 2 when trying to call dump_stack() in > panic(). Thus dump_stack() won't be called as this is considered as > nested stack-dumping. I had an earlier patch series to just dump from "interesting" machine checks (I think the interesting ones are when the kernel hit poison in code that hadn't been tagged in the extable as recoverable) https://lore.kernel.org/all/20220922195136.54575-1-tony.luck@intel.com/ Discussion on that fizzled out. Thanks for tracking down why panic() didn't provide a stack dump. I'm still of the opinion that stack dumps from machine checks aren't generally useful. But I'd rather have extra stack dumps than no stack dumps at all. -Tony
On 2022/12/2 22:44, Luck, Tony wrote: >> When machine check exception occurs, there is no stack-dumping now in >> mce_panic(). It's because bust_spinlocks(1) is called prematurely so >> oops_in_progress will be >= 2 when trying to call dump_stack() in >> panic(). Thus dump_stack() won't be called as this is considered as >> nested stack-dumping. > Many thanks for your quick reply. :) > I had an earlier patch series to just dump from "interesting" machine checks > (I think the interesting ones are when the kernel hit poison in code that hadn't > been tagged in the extable as recoverable) > > https://lore.kernel.org/all/20220922195136.54575-1-tony.luck@intel.com/ [1] > > Discussion on that fizzled out. > > Thanks for tracking down why panic() didn't provide a stack dump. > > I'm still of the opinion that stack dumps from machine checks aren't > generally useful. But I'd rather have extra stack dumps than no stack > dumps at all. As you mentioned in [1]: """ In order to ease the hunt for additional code flows where machine check errors can be recovered it is useful to know, for example, why the kernel was copying a page. """ So I think it's better to have at least one stack dumps. Also what the commit 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops") and commit 026ee1f66aaa ("panic: fix stack dump print on direct call to panic()") want to do is avoiding nested stack-dumping to have the original oops data being scrolled away on a 80x50 screen but to have *at least one backtraces*. So this patch acts more like a BUGFIX to ensure having at least one backtraces in mce_panic(). What's your thought, Luck? Thanks, Miaohe Lin
On Sat, Dec 03, 2022 at 10:19:32AM +0800, Miaohe Lin wrote: > So I think it's better to have at least one stack dumps. Also what the commit > 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops") and commit > 026ee1f66aaa ("panic: fix stack dump print on direct call to panic()") want > to do is avoiding nested stack-dumping to have the original oops data being > scrolled away on a 80x50 screen but to have *at least one backtraces*. So > this patch acts more like a BUGFIX to ensure having at least one backtraces > in mce_panic(). What's your thought, Luck? I tried out your patch with the ras-tools test: # ./einj_mem_uc -f copyout which currently causes a panic from the "recoverable" machine check. Your patch worked fine: [ 112.457735] stack backtrace: [ 112.457736] CPU: 124 PID: 3401 Comm: einj_mem_uc Not tainted 6.1.0-rc7+ #41 [ 112.457738] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYDCRB1.86B.0154.R04.1804231104 04/23/2018 [ 112.457739] Call Trace: [ 112.457740] <#MC> [ 112.457742] dump_stack_lvl+0x5a/0x78 [ 112.457746] dump_stack+0x10/0x16 [ 112.457748] print_usage_bug.part.0+0x1ad/0x1c4 [ 112.457755] lock_acquire.cold+0x16/0x47 [ 112.457759] ? down_trylock+0x14/0x40 [ 112.457762] ? panic+0x180/0x2b9 [ 112.457766] _raw_spin_lock_irqsave+0x4e/0x70 [ 112.457768] ? down_trylock+0x14/0x40 [ 112.457771] down_trylock+0x14/0x40 [ 112.457772] ? panic+0x180/0x2b9 [ 112.457775] __down_trylock_console_sem+0x34/0xc0 [ 112.457778] console_unblank+0x1d/0x90 [ 112.457781] panic+0x180/0x2b9 [ 112.457788] mce_panic+0x118/0x1e0 [ 112.457794] do_machine_check+0x79a/0x890 [ 112.457800] ? copy_user_enhanced_fast_string+0xa/0x50 [ 112.457810] exc_machine_check+0x76/0xb0 [ 112.457813] asm_exc_machine_check+0x1a/0x40 [ 112.457816] RIP: 0010:copy_user_enhanced_fast_string+0xa/0x50 [ 112.457819] Code: d1 f3 a4 31 c0 0f 01 ca c3 cc cc cc cc 8d 0c ca 89 ca eb 2c 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 01 cb 83 fa 40 72 48 89 d1 <f3> a4 31 c0 0f 01 ca c3 cc cc cc cc 89 ca eb 06 66 0f 1f 44 00 00 [ 112.457820] RSP: 0018:ffffb140f789bbd8 EFLAGS: 00050206 [ 112.457822] RAX: 0000000000001000 RBX: ffffb140f789be58 RCX: 0000000000000c00 [ 112.457824] RDX: 0000000000001000 RSI: ffff9133eb3f0400 RDI: 000055c1a36986c0 [ 112.457825] RBP: ffffb140f789bc68 R08: 00000000f789be01 R09: 0000000000001000 [ 112.457827] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000001000 [ 112.457828] R13: ffff9133eb3f0000 R14: 0000000000001000 R15: 0000000000000000 [ 112.457837] </#MC> [ 112.457838] <TASK> [ 112.457838] ? _copy_to_iter+0xc3/0x6f0 [ 112.457843] ? filemap_get_pages+0x9b/0x670 [ 112.457851] copy_page_to_iter+0x7c/0x1f0 [ 112.457854] ? find_held_lock+0x31/0x90 [ 112.457858] filemap_read+0x1ec/0x390 [ 112.457865] ? __fsnotify_parent+0x10f/0x310 [ 112.457867] ? aa_file_perm+0x1ab/0x610 [ 112.457875] generic_file_read_iter+0xf4/0x170 [ 112.457879] ext4_file_read_iter+0x5b/0x1e0 [ 112.457881] ? security_file_permission+0x4e/0x60 [ 112.457886] vfs_read+0x208/0x2e0 [ 112.457895] ksys_read+0x6d/0xf0 [ 112.457900] __x64_sys_read+0x19/0x20 [ 112.457902] do_syscall_64+0x38/0x90 [ 112.457906] entry_SYSCALL_64_after_hwframe+0x63/0xcd Tested-by: Tony Luck <tony.luck@intel.com> -Tony
On 2022/12/3 12:22, Tony Luck wrote: > On Sat, Dec 03, 2022 at 10:19:32AM +0800, Miaohe Lin wrote: >> So I think it's better to have at least one stack dumps. Also what the commit >> 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops") and commit >> 026ee1f66aaa ("panic: fix stack dump print on direct call to panic()") want >> to do is avoiding nested stack-dumping to have the original oops data being >> scrolled away on a 80x50 screen but to have *at least one backtraces*. So >> this patch acts more like a BUGFIX to ensure having at least one backtraces >> in mce_panic(). What's your thought, Luck? > > I tried out your patch with the ras-tools test: ras-tool is really convenient. :) > > # ./einj_mem_uc -f copyout > > which currently causes a panic from the "recoverable" machine check. > > Your patch worked fine: > > [ 112.457735] stack backtrace: > [ 112.457736] CPU: 124 PID: 3401 Comm: einj_mem_uc Not tainted 6.1.0-rc7+ #41 > [ 112.457738] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYDCRB1.86B.0154.R04.1804231104 04/23/2018 > [ 112.457739] Call Trace: > [ 112.457740] <#MC> > [ 112.457742] dump_stack_lvl+0x5a/0x78 > [ 112.457746] dump_stack+0x10/0x16 > [ 112.457748] print_usage_bug.part.0+0x1ad/0x1c4 > [ 112.457755] lock_acquire.cold+0x16/0x47 > [ 112.457759] ? down_trylock+0x14/0x40 > [ 112.457762] ? panic+0x180/0x2b9 > [ 112.457766] _raw_spin_lock_irqsave+0x4e/0x70 > [ 112.457768] ? down_trylock+0x14/0x40 > [ 112.457771] down_trylock+0x14/0x40 > [ 112.457772] ? panic+0x180/0x2b9 > [ 112.457775] __down_trylock_console_sem+0x34/0xc0 > [ 112.457778] console_unblank+0x1d/0x90 > [ 112.457781] panic+0x180/0x2b9 > [ 112.457788] mce_panic+0x118/0x1e0 > [ 112.457794] do_machine_check+0x79a/0x890 > [ 112.457800] ? copy_user_enhanced_fast_string+0xa/0x50 > [ 112.457810] exc_machine_check+0x76/0xb0 > [ 112.457813] asm_exc_machine_check+0x1a/0x40 > [ 112.457816] RIP: 0010:copy_user_enhanced_fast_string+0xa/0x50 > [ 112.457819] Code: d1 f3 a4 31 c0 0f 01 ca c3 cc cc cc cc 8d 0c ca 89 ca eb 2c 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 01 cb 83 fa 40 72 48 89 d1 <f3> a4 31 c0 0f 01 ca c3 cc cc cc cc 89 ca eb 06 66 0f 1f 44 00 00 > [ 112.457820] RSP: 0018:ffffb140f789bbd8 EFLAGS: 00050206 > [ 112.457822] RAX: 0000000000001000 RBX: ffffb140f789be58 RCX: 0000000000000c00 > [ 112.457824] RDX: 0000000000001000 RSI: ffff9133eb3f0400 RDI: 000055c1a36986c0 > [ 112.457825] RBP: ffffb140f789bc68 R08: 00000000f789be01 R09: 0000000000001000 > [ 112.457827] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000001000 > [ 112.457828] R13: ffff9133eb3f0000 R14: 0000000000001000 R15: 0000000000000000 > [ 112.457837] </#MC> > [ 112.457838] <TASK> > [ 112.457838] ? _copy_to_iter+0xc3/0x6f0 > [ 112.457843] ? filemap_get_pages+0x9b/0x670 > [ 112.457851] copy_page_to_iter+0x7c/0x1f0 > [ 112.457854] ? find_held_lock+0x31/0x90 > [ 112.457858] filemap_read+0x1ec/0x390 > [ 112.457865] ? __fsnotify_parent+0x10f/0x310 > [ 112.457867] ? aa_file_perm+0x1ab/0x610 > [ 112.457875] generic_file_read_iter+0xf4/0x170 > [ 112.457879] ext4_file_read_iter+0x5b/0x1e0 > [ 112.457881] ? security_file_permission+0x4e/0x60 > [ 112.457886] vfs_read+0x208/0x2e0 > [ 112.457895] ksys_read+0x6d/0xf0 > [ 112.457900] __x64_sys_read+0x19/0x20 > [ 112.457902] do_syscall_64+0x38/0x90 > [ 112.457906] entry_SYSCALL_64_after_hwframe+0x63/0xcd > > > Tested-by: Tony Luck <tony.luck@intel.com> > > -Tony Many thanks for your test, Tony! Thanks, Miaohe Lin
On 2022/12/3 0:37, Miaohe Lin wrote: > When machine check exception occurs, there is no stack-dumping now in > mce_panic(). It's because bust_spinlocks(1) is called prematurely so > oops_in_progress will be >= 2 when trying to call dump_stack() in > panic(). Thus dump_stack() won't be called as this is considered as > nested stack-dumping. > > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Friendly ping. ;) > --- > arch/x86/kernel/cpu/mce/core.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c > index 2c8ec5c71712..c40dad1a6749 100644 > --- a/arch/x86/kernel/cpu/mce/core.c > +++ b/arch/x86/kernel/cpu/mce/core.c > @@ -254,7 +254,6 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp) > wait_for_panic(); > barrier(); > > - bust_spinlocks(1); > console_verbose(); > } else { > /* Don't log too much for fake panic */ >
On 2022/12/10 10:28, Miaohe Lin wrote: > On 2022/12/3 0:37, Miaohe Lin wrote: >> When machine check exception occurs, there is no stack-dumping now in >> mce_panic(). It's because bust_spinlocks(1) is called prematurely so >> oops_in_progress will be >= 2 when trying to call dump_stack() in >> panic(). Thus dump_stack() won't be called as this is considered as >> nested stack-dumping. >> >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > > Friendly ping. ;) Friendly ping after busy merge window. :) > >> --- >> arch/x86/kernel/cpu/mce/core.c | 1 - >> 1 file changed, 1 deletion(-) >> >> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c >> index 2c8ec5c71712..c40dad1a6749 100644 >> --- a/arch/x86/kernel/cpu/mce/core.c >> +++ b/arch/x86/kernel/cpu/mce/core.c >> @@ -254,7 +254,6 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp) >> wait_for_panic(); >> barrier(); >> >> - bust_spinlocks(1); >> console_verbose(); >> } else { >> /* Don't log too much for fake panic */ >> >
On Sat, Dec 03, 2022 at 10:19:32AM +0800, Miaohe Lin wrote: > So I think it's better to have at least one stack dumps. Also what the commit > 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops") and commit > 026ee1f66aaa ("panic: fix stack dump print on direct call to panic()") want > to do is avoiding nested stack-dumping to have the original oops data being > scrolled away on a 80x50 screen but to have *at least one backtraces*. So > this patch acts more like a BUGFIX to ensure having at least one backtraces > in mce_panic(). So which commit broke this? One of the two above or 004429956b48 ("handle recursive calls to bust_spinlocks()") or d896a940ef4f ("x86, mce: remove oops_begin() use in 64bit machine check") or...? By looking at their dates, they're pretty much too old so that this can go to *all* stable kernels. Hmmm.
On 2022/12/29 19:54, Borislav Petkov wrote: > On Sat, Dec 03, 2022 at 10:19:32AM +0800, Miaohe Lin wrote: >> So I think it's better to have at least one stack dumps. Also what the commit >> 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops") and commit >> 026ee1f66aaa ("panic: fix stack dump print on direct call to panic()") want >> to do is avoiding nested stack-dumping to have the original oops data being >> scrolled away on a 80x50 screen but to have *at least one backtraces*. So >> this patch acts more like a BUGFIX to ensure having at least one backtraces >> in mce_panic(). > Many thanks for your reply. :) > So which commit broke this? > I think it should be Fixes: 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops") as there's missing stack-dumping in mce_panic() since then. Should I resend the patch with above Fixes tag attached? > One of the two above or > > 004429956b48 ("handle recursive calls to bust_spinlocks()") > > or > > d896a940ef4f ("x86, mce: remove oops_begin() use in 64bit machine check") > > or...? > > By looking at their dates, they're pretty much too old so that this can > go to *all* stable kernels. Yes, it should be a really old problem. And it seems it's unnoticed yet because mce should be a really rare event and corresponding dump-stacking might not be that attractive at that time. Thanks! Miaohe Lin
On Thu, Dec 29, 2022 at 08:33:55PM +0800, Miaohe Lin wrote:
> I think
You think or you know?
I'm assuming reverting 6e6f0a1f0fa6 shows the MCE panic?
I guess the original issue that commit was fixing is to save that
redundant oops message but Tony seems to want to see it now and I'm not
sure how much we care about 80x50 screens nowadays... :-)
On 2022/12/29 20:51, Borislav Petkov wrote: > On Thu, Dec 29, 2022 at 08:33:55PM +0800, Miaohe Lin wrote: >> I think > > You think or you know? The problem code logic exists since that time. > > I'm assuming reverting 6e6f0a1f0fa6 shows the MCE panic? Yes, that's right. > > I guess the original issue that commit was fixing is to save that > redundant oops message but Tony seems to want to see it now and I'm not > sure how much we care about 80x50 screens nowadays... :-) 80x50 screens should be really uncommon now. But avoiding redundant oops message should always save our time. ;) Thanks, Miaohe Lin
>> I guess the original issue that commit was fixing is to save that >> redundant oops message but Tony seems to want to see it now and I'm not >> sure how much we care about 80x50 screens nowadays... :-) I want a stack dump for the specific case of a recoverable machine check caused by poison consumption in kernel code that doesn't have an extable[] entry for a recovery path. That's a potential candidate for future kernel change to make that recoverable (if the code path seems common enough to warrant the churn), and there is some plausible way for s/w to "recover"). For most other machine checks the dump is very likely useless. E.g. some CPU core stalled so that the system generates a broadcast machine check because instructions are not being retired on that CPU core. In this case the machine check "monarch" is almost certainly some innocent bystander that was executing normally. Stack dump from that CPU is going to tell you nothing about the machine check. -Tony
On 2023/1/4 5:12, Luck, Tony wrote: >>> I guess the original issue that commit was fixing is to save that >>> redundant oops message but Tony seems to want to see it now and I'm not >>> sure how much we care about 80x50 screens nowadays... :-) > Many thanks for your thought. :) > I want a stack dump for the specific case of a recoverable machine check caused by > poison consumption in kernel code that doesn't have an extable[] entry for a recovery > path. That's a potential candidate for future kernel change to make that recoverable Sure, in this case a stack dump will be really helpful. We can gather these stack dumps and try to make the most frequent machine check scene recoverable. > (if the code path seems common enough to warrant the churn), and there is some > plausible way for s/w to "recover"). > > For most other machine checks the dump is very likely useless. E.g. some CPU core stalled > so that the system generates a broadcast machine check because instructions are not > being retired on that CPU core. In this case the machine check "monarch" is almost certainly > some innocent bystander that was executing normally. Stack dump from that CPU is going > to tell you nothing about the machine check. Agree. A stack dump won't be helpful in this case. But I tend to keep the stack dump in case it would be helpful and also make mce_panic() dumps the stack as expected. What do you think? Thanks, Miaohe Lin
> Agree. A stack dump won't be helpful in this case. But I tend to keep the stack dump in case > it would be helpful and also make mce_panic() dumps the stack as expected. What do you think? My #1 preference is to just get stack dumps from machine checks where the cause is known to be poison consumption by the kernel. Patches to do this were posted in September 2022 https://lore.kernel.org/all/20220922195136.54575-1-tony.luck@intel.com/ Boris has applied part 1 (which was a cleanup). But discussion on part 2 centered around why mce_panic() wasn't giving a full OOPs dump. That's now understood. If I can't get my first preference ... then a stack dump for every machine check is better than no stack dumps at all. So a patch that enables the OOPs dump would be a good second choice. I don't want to stick with the current behavior of no stack dumps. -Tony
On 2023/1/7 1:42, Luck, Tony wrote: >> Agree. A stack dump won't be helpful in this case. But I tend to keep the stack dump in case >> it would be helpful and also make mce_panic() dumps the stack as expected. What do you think? > > My #1 preference is to just get stack dumps from machine checks where the cause is known > to be poison consumption by the kernel. > > Patches to do this were posted in September 2022 > > https://lore.kernel.org/all/20220922195136.54575-1-tony.luck@intel.com/ [1] > > Boris has applied part 1 (which was a cleanup). But discussion on part 2 > centered around why mce_panic() wasn't giving a full OOPs dump. That's > now understood. > Many thanks for your valuable information. [1] uses MCE_PANIC_STACKDUMP_SEVERITY to pick the preferred ones and do a stack dump manually. But now stack dumps is done via panic(), it seems there's no easy way to achieve #1 preference. > > If I can't get my first preference ... then a stack dump for every machine check > is better than no stack dumps at all. So a patch that enables the OOPs dump > would be a good second choice. > > I don't want to stick with the current behavior of no stack dumps. As a first step, let's change the current behavior of no stack dumps first? And then, we could try to achieve #1 preference if possible. Thanks, Miaohe Lin
> As a first step, let's change the current behavior of no stack dumps first? And then, > we could try to achieve #1 preference if possible. Ok. Repost the patch that fixes things for an oops dump for every mce_panic(). Maybe then I can look at some way to suppress the oops dump for the "useless" cases? Or perhaps at least add an extra message that stack dumps from arbitrary machine checks are rarely useful. -Tony
On 2023/1/9 1:54, Luck, Tony wrote: >> As a first step, let's change the current behavior of no stack dumps first? And then, >> we could try to achieve #1 preference if possible. > > Ok. Repost the patch that fixes things for an oops dump for every mce_panic(). Do you mean make every call site of mce_panic() does a successful oops dump? I think this patch will have achieved that. Or do I miss-read your comment? > > Maybe then I can look at some way to suppress the oops dump for the "useless" > cases? Or perhaps at least add an extra message that stack dumps from arbitrary > machine checks are rarely useful. IMHO the user will always expect a stack dump when system panic. I'm afraid that if these "useless" cases are eliminated, it will cause user confused and trying to find what eats the stack dump? Thanks, Miaohe Lin
>> Maybe then I can look at some way to suppress the oops dump for the "useless" >> cases? Or perhaps at least add an extra message that stack dumps from arbitrary >> machine checks are rarely useful. > > IMHO the user will always expect a stack dump when system panic. I'm afraid that if these > "useless" cases are eliminated, it will cause user confused and trying to find what eats > the stack dump? What about the "extra message" option? Always print the full oops. but for ones from mce_panic() arrange somehow to add an extra bit of helpful text. Something like this: 1) For poison consumption in kernel code: "The reason for this panic is consumption of poisoned data by the kernel" 2) For the general machine check case: "The reason for this panic is a hardware detected error reported as a machine check. Note that the stack and register dumps are likely useless to diagnose the root cause of this panic." -Tony
On 2023/1/9 12:45, Luck, Tony wrote: >>> Maybe then I can look at some way to suppress the oops dump for the "useless" >>> cases? Or perhaps at least add an extra message that stack dumps from arbitrary >>> machine checks are rarely useful. >> >> IMHO the user will always expect a stack dump when system panic. I'm afraid that if these >> "useless" cases are eliminated, it will cause user confused and trying to find what eats >> the stack dump? > > What about the "extra message" option? Always print the full oops. but for ones from > mce_panic() arrange somehow to add an extra bit of helpful text. Something like this: > > 1) For poison consumption in kernel code: > > "The reason for this panic is consumption of poisoned data by the kernel" > > 2) For the general machine check case: > > "The reason for this panic is a hardware detected error reported as a machine check. > Note that the stack and register dumps are likely useless to diagnose the root cause > of this panic." It sounds like a good idea to me. Many thanks for your thought. And it's fine to squash my patch to your provided thought. :) Thanks, Miaohe Lin
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 2c8ec5c71712..c40dad1a6749 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -254,7 +254,6 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp) wait_for_panic(); barrier(); - bust_spinlocks(1); console_verbose(); } else { /* Don't log too much for fake panic */
When machine check exception occurs, there is no stack-dumping now in mce_panic(). It's because bust_spinlocks(1) is called prematurely so oops_in_progress will be >= 2 when trying to call dump_stack() in panic(). Thus dump_stack() won't be called as this is considered as nested stack-dumping. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> --- arch/x86/kernel/cpu/mce/core.c | 1 - 1 file changed, 1 deletion(-)