tcg: gdbstub: Fix single-step issue on arm target

Message ID: 20200220155834.21905-1-changbin.du@gmail.com (mailing list archive)
State: New, archived
From: Changbin Du <changbin.du@gmail.com>
To: alex.bennee@linaro.org, philmd@redhat.com
Cc: qemu-devel@nongnu.org, Changbin Du <changbin.du@gmail.com>
Subject: [PATCH] tcg: gdbstub: Fix single-step issue on arm target
Date: Thu, 20 Feb 2020 23:58:34 +0800

Changbin Du Feb. 20, 2020, 3:58 p.m. UTC

Recently, when debugging an arm32 system on QEMU, I found that the
single-step command (stepi) sometimes does not work. This can be
reproduced with the following steps:
 1) Start qemu-system-arm -s -S .. and wait for the gdb connection.
 2) Start gdb and connect to QEMU. In my case, gdb gets a wrong value
    (0x60) for PC.
 3) Once connected, type 'stepi' and expect it to stop at the next
    instruction.

But it never stops. This is because:
 1) We don't explicitly report the 'vContSupported' feature to gdb, so gdb
    assumes we do not support it. In this case, gdb uses a software
    breakpoint to emulate single-stepping.
 2) Since gdb gets a wrong initial value for PC, it inserts the
    breakpoint at the wrong place (PC+4).

Since we do support the 'vContSupported' query command, let's tell gdb that
we support it.

Before this change, gdb sends the following 'Z0' packet to implement single-step:
gdb_handle_packet: Z0,4,4

After this change, gdb sends "vCont;s..", which is expected:
gdb_handle_packet: vCont?
put_packet: vCont;c;C;s;S
gdb_handle_packet: vCont;s:p1.1;c:p1.-1

Signed-off-by: Changbin Du <changbin.du@gmail.com>
---
 gdbstub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
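
As a rough illustration of what advertising the feature means on the wire, the sketch below checks whether a qSupported reply lists a given feature with a '+' suffix. The helper name reply_has_feature and the sample reply strings in the usage note (including the PacketSize value) are made up for this example; it is not GDB's or QEMU's actual parsing code.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the ';'-separated qSupported reply contains the item
 * "<name>+". Hypothetical helper for illustration only.
 */
static bool reply_has_feature(const char *reply, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = reply;

    while (*p != '\0') {
        /* Each item runs up to the next ';' or the end of the string. */
        size_t item_len = strcspn(p, ";");

        if (item_len == name_len + 1 &&
            strncmp(p, name, name_len) == 0 &&
            p[name_len] == '+') {
            return true;
        }

        p += item_len;
        if (*p == ';') {
            p++; /* skip the separator */
        }
    }
    return false;
}
```

With an assumed reply such as "PacketSize=1000;vContSupported+;multiprocess+" the check succeeds; with "PacketSize=1000;multiprocess+" (the shape of the reply before this patch) it does not.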

Philippe Mathieu-Daudé Feb. 20, 2020, 5:47 p.m. UTC | #1

On 2/20/20 4:58 PM, Changbin Du wrote:
> Recently when debugging an arm32 system on qemu, I found sometimes the
> single-step command (stepi) is not working. This can be reproduced by
> below steps:
>   1) start qemu-system-arm -s -S .. and wait for gdb connection.
>   2) start gdb and connect to qemu. In my case, gdb gets a wrong value
>      (0x60) for PC.
>   3) After connected, type 'stepi' and expect it will stop at next ins.
> 
> But, it has never stopped. This because:
>   1) We doesn't report ‘vContSupported’ feature to gdb explicitly and gdb
>      think we do not support it. In this case, gdb use a software breakpoint
>      to emulate single-step.
>   2) Since gdb gets a wrong initial value of PC, then gdb inserts a
>      breakpoint to wrong place (PC+4).
> 
> Since we do support ‘vContSupported’ query command, so let's tell gdb that
> we support it.
> 
> Before this change, gdb send below 'Z0' packet to implement single-step:
> gdb_handle_packet: Z0,4,4
> 
> After this change, gdb send "vCont;s.." which is expected:
> gdb_handle_packet: vCont?
> put_packet: vCont;c;C;s;S
> gdb_handle_packet: vCont;s:p1.1;c:p1.-1

You actually fixed this for all architectures :)

This has been annoying me on MIPS for more than a year...

I haven't checked the GDB protocol spec, but so far:
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> 
> Signed-off-by: Changbin Du <changbin.du@gmail.com>
> ---
>   gdbstub.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index ce304ff482..adccd938e2 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -2111,7 +2111,7 @@ static void handle_query_supported(GdbCmdContext *gdb_ctx, void *user_ctx)
>           gdb_ctx->s->multiprocess = true;
>       }
>   
> -    pstrcat(gdb_ctx->str_buf, sizeof(gdb_ctx->str_buf), ";multiprocess+");
> +    pstrcat(gdb_ctx->str_buf, sizeof(gdb_ctx->str_buf), ";vContSupported+;multiprocess+");
>       put_packet(gdb_ctx->s, gdb_ctx->str_buf);
>   }
>   
>

Peter Maydell Feb. 20, 2020, 5:58 p.m. UTC | #2

On Thu, 20 Feb 2020 at 15:59, Changbin Du <changbin.du@gmail.com> wrote:
>
> Recently when debugging an arm32 system on qemu, I found sometimes the
> single-step command (stepi) is not working. This can be reproduced by
> below steps:
>  1) start qemu-system-arm -s -S .. and wait for gdb connection.
>  2) start gdb and connect to qemu. In my case, gdb gets a wrong value
>     (0x60) for PC.
>  3) After connected, type 'stepi' and expect it will stop at next ins.
>
> But, it has never stopped. This because:
>  1) We doesn't report ‘vContSupported’ feature to gdb explicitly and gdb
>     think we do not support it. In this case, gdb use a software breakpoint
>     to emulate single-step.
>  2) Since gdb gets a wrong initial value of PC, then gdb inserts a
>     breakpoint to wrong place (PC+4).
>
> Since we do support ‘vContSupported’ query command, so let's tell gdb that
> we support it.
>
> Before this change, gdb send below 'Z0' packet to implement single-step:
> gdb_handle_packet: Z0,4,4
>
> After this change, gdb send "vCont;s.." which is expected:
> gdb_handle_packet: vCont?
> put_packet: vCont;c;C;s;S
> gdb_handle_packet: vCont;s:p1.1;c:p1.-1
>
> Signed-off-by: Changbin Du <changbin.du@gmail.com>

Certainly if we support vCont we should advertise it. But why
does the fallback path not work? That is, why does gdb get a
wrong PC value initially?

thanks
-- PMM

Laurent Vivier Feb. 20, 2020, 6:06 p.m. UTC | #3

Le 20/02/2020 à 18:47, Philippe Mathieu-Daudé a écrit :
> On 2/20/20 4:58 PM, Changbin Du wrote:
>> Recently when debugging an arm32 system on qemu, I found sometimes the
>> single-step command (stepi) is not working. This can be reproduced by
>> below steps:
>>   1) start qemu-system-arm -s -S .. and wait for gdb connection.
>>   2) start gdb and connect to qemu. In my case, gdb gets a wrong value
>>      (0x60) for PC.
>>   3) After connected, type 'stepi' and expect it will stop at next ins.
>>
>> But, it has never stopped. This because:
>>   1) We doesn't report ‘vContSupported’ feature to gdb explicitly and gdb
>>      think we do not support it. In this case, gdb use a software
>> breakpoint
>>      to emulate single-step.
>>   2) Since gdb gets a wrong initial value of PC, then gdb inserts a
>>      breakpoint to wrong place (PC+4).
>>
>> Since we do support ‘vContSupported’ query command, so let's tell gdb
>> that
>> we support it.
>>
>> Before this change, gdb send below 'Z0' packet to implement single-step:
>> gdb_handle_packet: Z0,4,4
>>
>> After this change, gdb send "vCont;s.." which is expected:
>> gdb_handle_packet: vCont?
>> put_packet: vCont;c;C;s;S
>> gdb_handle_packet: vCont;s:p1.1;c:p1.-1
> 
> You actually fixed this for all architectures :)
> 
> This has been annoying me on MIPS since more than a year...

Did the problem start with an update of QEMU or of GDB?

At one point it seemed to work, so what happened?

Thanks,
Laurent

Philippe Mathieu-Daudé Feb. 20, 2020, 6:55 p.m. UTC | #4

On 2/20/20 7:06 PM, Laurent Vivier wrote:
> Le 20/02/2020 à 18:47, Philippe Mathieu-Daudé a écrit :
>> On 2/20/20 4:58 PM, Changbin Du wrote:
>>> Recently when debugging an arm32 system on qemu, I found sometimes the
>>> single-step command (stepi) is not working. This can be reproduced by
>>> below steps:
>>>    1) start qemu-system-arm -s -S .. and wait for gdb connection.
>>>    2) start gdb and connect to qemu. In my case, gdb gets a wrong value
>>>       (0x60) for PC.
>>>    3) After connected, type 'stepi' and expect it will stop at next ins.
>>>
>>> But, it has never stopped. This because:
>>>    1) We doesn't report ‘vContSupported’ feature to gdb explicitly and gdb
>>>       think we do not support it. In this case, gdb use a software
>>> breakpoint
>>>       to emulate single-step.
>>>    2) Since gdb gets a wrong initial value of PC, then gdb inserts a
>>>       breakpoint to wrong place (PC+4).
>>>
>>> Since we do support ‘vContSupported’ query command, so let's tell gdb
>>> that
>>> we support it.
>>>
>>> Before this change, gdb send below 'Z0' packet to implement single-step:
>>> gdb_handle_packet: Z0,4,4
>>>
>>> After this change, gdb send "vCont;s.." which is expected:
>>> gdb_handle_packet: vCont?
>>> put_packet: vCont;c;C;s;S
>>> gdb_handle_packet: vCont;s:p1.1;c:p1.-1
>>
>> You actually fixed this for all architectures :)
>>
>> This has been annoying me on MIPS since more than a year...
> 
> The problem started with an update of QEMU or of GDB?
> 
> At one point it seemed to work, so what happened?

I'd say gdb. I can try different combinations of QEMU/gdb but I won't do 
that soon.

Luc Michel Feb. 20, 2020, 9:24 p.m. UTC | #5

Hi,

On 2/20/20 4:58 PM, Changbin Du wrote:
> Recently when debugging an arm32 system on qemu, I found sometimes the
> single-step command (stepi) is not working. This can be reproduced by
> below steps:
>  1) start qemu-system-arm -s -S .. and wait for gdb connection.
>  2) start gdb and connect to qemu. In my case, gdb gets a wrong value
>     (0x60) for PC.
>  3) After connected, type 'stepi' and expect it will stop at next ins.
> 
> But, it has never stopped. This because:
>  1) We doesn't report ‘vContSupported’ feature to gdb explicitly and gdb
>     think we do not support it. In this case, gdb use a software breakpoint
>     to emulate single-step.
>  2) Since gdb gets a wrong initial value of PC, then gdb inserts a
>     breakpoint to wrong place (PC+4).
> 
> Since we do support ‘vContSupported’ query command, so let's tell gdb that
> we support it.
> 
> Before this change, gdb send below 'Z0' packet to implement single-step:
> gdb_handle_packet: Z0,4,4
> 
> After this change, gdb send "vCont;s.." which is expected:
> gdb_handle_packet: vCont?
> put_packet: vCont;c;C;s;S
> gdb_handle_packet: vCont;s:p1.1;c:p1.-1
I'm curious, I never experienced this behaviour from GDB. What GDB and
QEMU versions are you using?

On my side (GDB 9.1), even without 'vContSupported+' in the 'qSupported'
answer, GDB sends a 'vCont?' packet on the first stepi:

0x00000000 in ?? ()
(gdb) si
Sending packet: $m0,4#fd...Ack
Packet received: 00000000
Sending packet: $vCont?#49...Ack
Packet received: vCont;c;C;s;S
Packet vCont (verbose-resume) is supported
Sending packet: $vCont;s:p1.1;c:p1.-1#f7...Ack
Packet received: T05thread:p01.01;
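
For readers unfamiliar with the framing in the log above: each packet is sent as '$', the payload, '#', and a two-digit hex checksum, where the checksum is the sum of the payload bytes modulo 256. A minimal sketch follows; the function names rsp_checksum and rsp_frame are invented for this example and are not taken from gdbstub.c.

```c
#include <stdint.h>
#include <stdio.h>

/* Sum of all payload bytes, modulo 256 (uint8_t arithmetic wraps). */
static uint8_t rsp_checksum(const char *payload)
{
    uint8_t sum = 0;

    for (const char *p = payload; *p != '\0'; p++) {
        sum += (uint8_t)*p;
    }
    return sum;
}

/*
 * Write "$<payload>#<checksum>" into out. Returns the snprintf() result,
 * i.e. the length the framed packet needs (excluding the trailing NUL).
 */
static int rsp_frame(char *out, size_t out_size, const char *payload)
{
    return snprintf(out, out_size, "$%s#%02x", payload,
                    (unsigned)rsp_checksum(payload));
}
```

Framing the payloads "m0,4" and "vCont?" this way yields "$m0,4#fd" and "$vCont?#49", matching the packets in the log.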

Your second issue (wrong PC value) should be investigated though. Does
it happen on QEMU vanilla? Do you have a way to reproduce this bug?

Anyway after re-reading the GDB remote protocol documentation, I think
your patch is right, the feature should be advertised.

However, I think your commit message needs some modifications. This fix
is not specific to ARM or TCG, but to the gdbstub itself. You also
mention the bug you have with PC, which is not related to the bug you
are fixing here. Could you rewrite it in a more generic way? You simply
need to emphasize the effect that advertising the 'vContSupported+'
feature has on GDB.

Thanks.

Changbin Du Feb. 21, 2020, 12:07 a.m. UTC | #6

On Thu, Feb 20, 2020 at 10:24:37PM +0100, Luc Michel wrote:
> Hi,
> 
> On 2/20/20 4:58 PM, Changbin Du wrote:
> > Recently when debugging an arm32 system on qemu, I found sometimes the
> > single-step command (stepi) is not working. This can be reproduced by
> > below steps:
> >  1) start qemu-system-arm -s -S .. and wait for gdb connection.
> >  2) start gdb and connect to qemu. In my case, gdb gets a wrong value
> >     (0x60) for PC.
> >  3) After connected, type 'stepi' and expect it will stop at next ins.
> > 
> > But, it has never stopped. This because:
> >  1) We doesn't report ‘vContSupported’ feature to gdb explicitly and gdb
> >     think we do not support it. In this case, gdb use a software breakpoint
> >     to emulate single-step.
> >  2) Since gdb gets a wrong initial value of PC, then gdb inserts a
> >     breakpoint to wrong place (PC+4).
> > 
> > Since we do support ‘vContSupported’ query command, so let's tell gdb that
> > we support it.
> > 
> > Before this change, gdb send below 'Z0' packet to implement single-step:
> > gdb_handle_packet: Z0,4,4
> > 
> > After this change, gdb send "vCont;s.." which is expected:
> > gdb_handle_packet: vCont?
> > put_packet: vCont;c;C;s;S
> > gdb_handle_packet: vCont;s:p1.1;c:p1.-1
> I'm curious, I never experienced this behaviour from GDB. What GDB and
> QEMU versions are you using?
> 
For QEMU, it's built from mainline.
For GDB, I have tried 8.1 and latest 9.1.

> On my side (GDB 9.1), even without 'vContSupported+' in the 'qSupported'
> answer, GDB sends a 'vCont?' packet on the first stepi:
> 
> 0x00000000 in ?? ()
> (gdb) si
> Sending packet: $m0,4#fd...Ack
> Packet received: 00000000
> Sending packet: $vCont?#49...Ack
> Packet received: vCont;c;C;s;S
> Packet vCont (verbose-resume) is supported
> Sending packet: $vCont;s:p1.1;c:p1.-1#f7...Ack
> Packet received: T05thread:p01.01;
>
Hmm, on my side this is 100% reproducible on arm32, but not on aarch64. I
think GDB makes different assumptions for different architectures.

> Your second issue (wrong PC value) should be investigated though. Does
> it happen on QEMU vanilla? Do you have a way to reproduce this bug?
> 
This is also 100% reproducible with the ELF guest I tested. Sorry, but I
can't share it. I will probably look into this issue in a few days.

> Anyway after re-reading the GDB remote protocol documentation, I think
> your patch is right, the feature should be advertised.
> 
> However I think your commit message needs some modifications. This fix
> is not specific to ARM or TCG, but to the gdbstub itself. You also
> mention this bug you have with PC, which is not related to the bug you
> are fixing here. Could you rewrite it in a more generic way? You simply
> need to emphasis the effect of advertising the 'vContSupported+' feature
> on GDB.
> 
Sure.

> Thanks.
> 
> -- 
> Luc

Changbin Du Feb. 21, 2020, 12:08 a.m. UTC | #7

On Thu, Feb 20, 2020 at 06:47:26PM +0100, Philippe Mathieu-Daudé wrote:
> On 2/20/20 4:58 PM, Changbin Du wrote:
> > Recently when debugging an arm32 system on qemu, I found sometimes the
> > single-step command (stepi) is not working. This can be reproduced by
> > below steps:
> >   1) start qemu-system-arm -s -S .. and wait for gdb connection.
> >   2) start gdb and connect to qemu. In my case, gdb gets a wrong value
> >      (0x60) for PC.
> >   3) After connected, type 'stepi' and expect it will stop at next ins.
> > 
> > But, it has never stopped. This because:
> >   1) We doesn't report ‘vContSupported’ feature to gdb explicitly and gdb
> >      think we do not support it. In this case, gdb use a software breakpoint
> >      to emulate single-step.
> >   2) Since gdb gets a wrong initial value of PC, then gdb inserts a
> >      breakpoint to wrong place (PC+4).
> > 
> > Since we do support ‘vContSupported’ query command, so let's tell gdb that
> > we support it.
> > 
> > Before this change, gdb send below 'Z0' packet to implement single-step:
> > gdb_handle_packet: Z0,4,4
> > 
> > After this change, gdb send "vCont;s.." which is expected:
> > gdb_handle_packet: vCont?
> > put_packet: vCont;c;C;s;S
> > gdb_handle_packet: vCont;s:p1.1;c:p1.-1
> 
> You actually fixed this for all architectures :)
> 
> This has been annoying me on MIPS since more than a year...
> 
> I haven't checked the GDB protocol spec, but so far:
> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>
Thanks for your feedback. :)

Changbin Du Feb. 21, 2020, 11:51 a.m. UTC | #8

On Thu, Feb 20, 2020 at 10:24:37PM +0100, Luc Michel wrote:
> I'm curious, I never experienced this behaviour from GDB. What GDB and
> QEMU versions are you using?
> 
> On my side (GDB 9.1), even without 'vContSupported+' in the 'qSupported'
> answer, GDB sends a 'vCont?' packet on the first stepi:
> 
> 0x00000000 in ?? ()
> (gdb) si
> Sending packet: $m0,4#fd...Ack
> Packet received: 00000000
> Sending packet: $vCont?#49...Ack
> Packet received: vCont;c;C;s;S
> Packet vCont (verbose-resume) is supported
> Sending packet: $vCont;s:p1.1;c:p1.-1#f7...Ack
> Packet received: T05thread:p01.01;
> 
> Your second issue (wrong PC value) should be investigated though. Does
> it happen on QEMU vanilla? Do you have a way to reproduce this bug?
> 
Just confirmed this issue. This is an endianness problem for gdb. I was
debugging a big-endian ELF and my host CPU is little-endian. The QEMU gdbstub
always uses the host CPU's endianness, but the gdb client treats the data as
big-endian after inspecting the ELF info.
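
To make the effect concrete, here is a small sketch of how the same four register bytes decode to different values depending on the byte order the reader assumes. The wire string "00000060" is an assumed example chosen to match the 0x60 reported in the commit message (the thread does not show the actual register packet), and decode_reg32 is a hypothetical helper, not gdb or QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value of one hex digit; invalid characters map to 0 (input assumed valid). */
static uint32_t hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return (uint32_t)(c - '0');
    if (c >= 'a' && c <= 'f') return (uint32_t)(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return (uint32_t)(c - 'A' + 10);
    return 0;
}

/*
 * Decode 8 hex digits (4 bytes, in the order they appear on the wire)
 * as a 32-bit register value using the given byte order.
 */
static uint32_t decode_reg32(const char *hex, bool big_endian)
{
    uint32_t value = 0;

    for (int i = 0; i < 4; i++) {
        uint32_t byte = (hex_nibble(hex[2 * i]) << 4) | hex_nibble(hex[2 * i + 1]);

        if (big_endian) {
            value = (value << 8) | byte;   /* first byte is most significant */
        } else {
            value |= byte << (8 * i);      /* first byte is least significant */
        }
    }
    return value;
}
```

Under this assumption, decoding "00000060" as big-endian gives 0x60, while decoding it as little-endian gives 0x60000000, so a client that picks the wrong byte order ends up placing its breakpoint relative to a bogus PC.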

I can manually set it to little-endian, but that is painful: gdb then complains
about invalid opcode errors in the debuginfo.

I also noticed that someone else has already tried to resolve this issue.
https://patchwork.kernel.org/patch/9528947/

> Anyway after re-reading the GDB remote protocol documentation, I think
> your patch is right, the feature should be advertised.
> 
> However I think your commit message needs some modifications. This fix
> is not specific to ARM or TCG, but to the gdbstub itself. You also
> mention this bug you have with PC, which is not related to the bug you
> are fixing here. Could you rewrite it in a more generic way? You simply
> need to emphasis the effect of advertising the 'vContSupported+' feature
> on GDB.
> 
> Thanks.
> 
> -- 
> Luc

Philippe Mathieu-Daudé Feb. 21, 2020, 12:33 p.m. UTC | #9

On 2/21/20 12:51 PM, Changbin Du wrote:
> On Thu, Feb 20, 2020 at 10:24:37PM +0100, Luc Michel wrote:
>> I'm curious, I never experienced this behaviour from GDB. What GDB and
>> QEMU versions are you using?
>>
>> On my side (GDB 9.1), even without 'vContSupported+' in the 'qSupported'
>> answer, GDB sends a 'vCont?' packet on the first stepi:
>>
>> 0x00000000 in ?? ()
>> (gdb) si
>> Sending packet: $m0,4#fd...Ack
>> Packet received: 00000000
>> Sending packet: $vCont?#49...Ack
>> Packet received: vCont;c;C;s;S
>> Packet vCont (verbose-resume) is supported
>> Sending packet: $vCont;s:p1.1;c:p1.-1#f7...Ack
>> Packet received: T05thread:p01.01;
>>
>> Your second issue (wrong PC value) should be investigated though. Does
>> it happen on QEMU vanilla? Do you have a way to reproduce this bug?
>>
> Just confirmed this issue. This is an endianness problem for gdb. I was
> debugging an big-endian elf and my host cpu is little-endian. QEMU gdbstub
> always uses host cpu endian but gdb client treats it as big-endian by
> inspecting elf info.

I'm using Debian gdb-multiarch, and I do indeed debug cross-endian targets (I
always set arch/endian explicitly). This might be why I hit this too.

> 
> I can mannually set it to little-endian but it is painful. The gdb complains
> abount invalid opcode error in debuginfo.
> 
> I also noticed that someoneelse has already tried to resolve this issue.
> https://patchwork.kernel.org/patch/9528947/
> 
>> Anyway after re-reading the GDB remote protocol documentation, I think
>> your patch is right, the feature should be advertised.
>>
>> However I think your commit message needs some modifications. This fix
>> is not specific to ARM or TCG, but to the gdbstub itself. You also
>> mention this bug you have with PC, which is not related to the bug you
>> are fixing here. Could you rewrite it in a more generic way? You simply
>> need to emphasis the effect of advertising the 'vContSupported+' feature
>> on GDB.
>>
>> Thanks.
>>
>> -- 
>> Luc
>
