Message ID | 20220411093819.1012583-1-sumit.garg@linaro.org (mailing list archive) |
---|---|
Series | arm64: kgdb/kdb: Fix pending single-step debugging issues |
Hi,

On Mon, Apr 11, 2022 at 2:38 AM Sumit Garg <sumit.garg@linaro.org> wrote:
>
> This patch-set reworks pending fixes from Wei's series [1] to make
> single-step debugging via kgdb/kdb on arm64 work as expected. There was
> a prior discussion on ML [2] regarding if we should keep the interrupts
> enabled during single-stepping but it turns out that in case of kgdb, it
> is risky to enable interrupts as sometimes a resume after single
> stepping an interrupt handler leads to following unbalanced locking
> issue:
>
> [ 300.328300] WARNING: bad unlock balance detected!
> [ 300.328608] 5.18.0-rc1-00016-g3e732ebf7316-dirty #6 Not tainted
> [ 300.329058] -------------------------------------
> [ 300.329298] sh/173 is trying to release lock (dbg_slave_lock) at:
> [ 300.329718] [<ffffd57c951c016c>] kgdb_cpu_enter+0x7ac/0x820
> [ 300.330029] but there are no more locks to release!
> [ 300.330265]
> [ 300.330265] other info that might help us debug this:
> [ 300.330668] 4 locks held by sh/173:
> [ 300.330891] #0: ffff4f5e454d8438 (sb_writers#3){.+.+}-{0:0}, at: vfs_write+0x98/0x204
> [ 300.331735] #1: ffffd57c973bc2f0 (dbg_slave_lock){+.+.}-{2:2}, at: kgdb_cpu_enter+0x5b4/0x820
> [ 300.332259] #2: ffffd57c973a9460 (rcu_read_lock){....}-{1:2}, at: kgdb_cpu_enter+0xe0/0x820
> [ 300.332717] #3: ffffd57c973bc2a8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x1ec/0x820
>
> So, I choose to keep interrupts disabled specifically for kgdb. This
> series has been rebased to Linux 5.18-rc1 and I have dropped Doug's
> review and test tags as there is significant rework involved.

Hmmmm. I guess it's really up to Will here, but re-reading his
previous email made it pretty clear that he wasn't willing to land a
solution that left interrupts disabled during step. He also pointed
out some things that would actually be broken, like single-stepping
over a call to irqs_disabled() or single-stepping over something that
caused an exception where the exception handler needed interrupts
enabled.

I thought he had a proposal at:

https://lore.kernel.org/r/20200626095551.GA9312@willie-the-truck

...that was supposed to make all the problems go away and it was just
that nobody had time to implement his proposal?

> [1] https://lore.kernel.org/all/20200509214159.19680-1-liwei391@huawei.com/
> [2] https://lore.kernel.org/all/CAD=FV=Voyfq3Qz0T3RY+aYWYJ0utdH=P_AweB=13rcV8GDBeyQ@mail.gmail.com/
>
> Sumit Garg (2):
>   arm64: kgdb: Fix incorrect single stepping into the irq handler
>   arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step
>
>  arch/arm64/include/asm/debug-monitors.h |  1 +
>  arch/arm64/kernel/debug-monitors.c      |  5 ++++
>  arch/arm64/kernel/kgdb.c                | 35 +++++++++++++++++++++++--
>  3 files changed, 39 insertions(+), 2 deletions(-)
>
> --
> 2.25.1
>
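
The irqs_disabled() concern raised above can be illustrated with a
small user-space model. This is only a sketch and not the kernel
implementation: it assumes the step is made "safe" by setting the IRQ
mask bit (bit 7, the I bit) in the saved PSTATE for the duration of
the step and restoring it afterwards. The names toy_cpu,
toy_irqs_disabled(), toy_step_irqs_masked() and
toy_step_irqs_unchanged() are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IRQ mask bit (PSTATE.I) in the saved program status on arm64. */
#define PSR_I_BIT (UINT64_C(1) << 7)

/* Hypothetical, heavily reduced CPU model: only the saved PSTATE. */
struct toy_cpu {
	uint64_t pstate;
};

/* What the code being debugged would see if it asked "are IRQs masked?". */
static bool toy_irqs_disabled(const struct toy_cpu *cpu)
{
	return (cpu->pstate & PSR_I_BIT) != 0;
}

/* One stepped "instruction": reads CPU state and returns what it observed. */
typedef bool (*toy_insn_fn)(const struct toy_cpu *cpu);

/*
 * Assumed approach under discussion: force IRQs masked for the step,
 * then restore the original mask bit. The stepped instruction runs
 * while the mask is forced on.
 */
static bool toy_step_irqs_masked(struct toy_cpu *cpu, toy_insn_fn insn)
{
	assert(cpu != NULL && insn != NULL);

	uint64_t saved_i = cpu->pstate & PSR_I_BIT;

	cpu->pstate |= PSR_I_BIT;		/* mask IRQs for the step */
	bool observed = insn(cpu);		/* the stepped instruction */
	cpu->pstate = (cpu->pstate & ~PSR_I_BIT) | saved_i;	/* restore */

	return observed;
}

/* Reference behaviour: step without touching the interrupt mask. */
static bool toy_step_irqs_unchanged(struct toy_cpu *cpu, toy_insn_fn insn)
{
	assert(cpu != NULL && insn != NULL);
	return insn(cpu);
}
```

With a toy_cpu whose pstate is 0 (interrupts enabled), stepping
toy_irqs_disabled() through toy_step_irqs_unchanged() reports false,
while stepping it through toy_step_irqs_masked() reports true even
though pstate is back to 0 afterwards. In this model the debugged code
sees a different answer under the debugger than it would when running
freely, which is the kind of breakage described above.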
Hi Doug,

Thanks for looking into this patch-set.

On Tue, 12 Apr 2022 at 05:39, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Mon, Apr 11, 2022 at 2:38 AM Sumit Garg <sumit.garg@linaro.org> wrote:
> >
> > This patch-set reworks pending fixes from Wei's series [1] to make
> > single-step debugging via kgdb/kdb on arm64 work as expected. There was
> > a prior discussion on ML [2] regarding if we should keep the interrupts
> > enabled during single-stepping but it turns out that in case of kgdb, it
> > is risky to enable interrupts as sometimes a resume after single
> > stepping an interrupt handler leads to following unbalanced locking
> > issue:
> >
> > [ 300.328300] WARNING: bad unlock balance detected!
> > [ 300.328608] 5.18.0-rc1-00016-g3e732ebf7316-dirty #6 Not tainted
> > [ 300.329058] -------------------------------------
> > [ 300.329298] sh/173 is trying to release lock (dbg_slave_lock) at:
> > [ 300.329718] [<ffffd57c951c016c>] kgdb_cpu_enter+0x7ac/0x820
> > [ 300.330029] but there are no more locks to release!
> > [ 300.330265]
> > [ 300.330265] other info that might help us debug this:
> > [ 300.330668] 4 locks held by sh/173:
> > [ 300.330891] #0: ffff4f5e454d8438 (sb_writers#3){.+.+}-{0:0}, at: vfs_write+0x98/0x204
> > [ 300.331735] #1: ffffd57c973bc2f0 (dbg_slave_lock){+.+.}-{2:2}, at: kgdb_cpu_enter+0x5b4/0x820
> > [ 300.332259] #2: ffffd57c973a9460 (rcu_read_lock){....}-{1:2}, at: kgdb_cpu_enter+0xe0/0x820
> > [ 300.332717] #3: ffffd57c973bc2a8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x1ec/0x820
> >
> > So, I choose to keep interrupts disabled specifically for kgdb. This
> > series has been rebased to Linux 5.18-rc1 and I have dropped Doug's
> > review and test tags as there is significant rework involved.
>
> Hmmmm. I guess it's really up to Will here, but re-reading his
> previous email made it pretty clear that he wasn't willing to land a
> solution that left interrupts disabled during step. He also pointed
> out some things that would actually be broken, like single-stepping
> over a call to irqs_disabled() or single-stepping over something that
> caused an exception where the exception handler needed interrupts
> enabled.
>
> I thought he had a proposal at:
>
> https://lore.kernel.org/r/20200626095551.GA9312@willie-the-truck
>
> ...that was supposed to make all the problems go away and it was just
> that nobody had time to implement his proposal?
>

So I took a shot at Will's proposal as a replacement for patch #1 in
v2 [1]. I hope that it is aligned with Will's thinking.

[1] https://lkml.org/lkml/2022/4/13/136

-Sumit

> > [1] https://lore.kernel.org/all/20200509214159.19680-1-liwei391@huawei.com/
> > [2] https://lore.kernel.org/all/CAD=FV=Voyfq3Qz0T3RY+aYWYJ0utdH=P_AweB=13rcV8GDBeyQ@mail.gmail.com/
> >
> > Sumit Garg (2):
> >   arm64: kgdb: Fix incorrect single stepping into the irq handler
> >   arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step
> >
> >  arch/arm64/include/asm/debug-monitors.h |  1 +
> >  arch/arm64/kernel/debug-monitors.c      |  5 ++++
> >  arch/arm64/kernel/kgdb.c                | 35 +++++++++++++++++++++++--
> >  3 files changed, 39 insertions(+), 2 deletions(-)
> >
> > --
> > 2.25.1
> >