Semihost SYS_READC implementation (v4)

Message ID	20191024224622.12371-1-keithp@keithp.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=lXeL=YR=nongnu.org=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0D2F52084C From: Keith Packard <keithp@keithp.com> To: qemu-devel@nongnu.org Subject: [PATCH] Semihost SYS_READC implementation (v4) Date: Thu, 24 Oct 2019 15:46:22 -0700 Message-Id: <20191024224622.12371-1-keithp@keithp.com> In-Reply-To: <89ada4b1-ee3d-a512-07c2-9bc1ba5806da@redhat.com> References: <89ada4b1-ee3d-a512-07c2-9bc1ba5806da@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: list Cc: Keith Packard <keithp@keithp.com> Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
Series	Semihost SYS_READC implementation (v4) \| expand Semihost SYS_READC implementation (v4)

Keith Packard Oct. 24, 2019, 10:46 p.m. UTC

Provides a blocking call to read a character from the console using
semihosting.chardev, if specified. This takes some careful command
line options to use stdio successfully as the serial ports, monitor
and semihost all want to use stdio. Here's a sample set of command
line options which share stdio betwen semihost, monitor and serial
ports:

	qemu \
	-chardev stdio,mux=on,id=stdio0 \
	-serial chardev:stdio0 \
	-semihosting-config enable=on,chardev=stdio0 \
	-mon chardev=stdio0,mode=readline

This creates a chardev hooked to stdio and then connects all of the
subsystems to it. A shorter mechanism would be good to hear about.

v2:
	Add implementation in linux-user/arm/semihost.c

v3:  (thanks to Paolo Bonzini <pbonzini@redhat.com>)
	Replace hand-rolled fifo with fifo8
	Avoid mixing code and declarations
	Remove spurious (void) cast of function parameters
	Define qemu_semihosting_console_init when CONFIG_USER_ONLY

v4:
	Add qemu_semihosting_console_init to stubs/semihost.c for
	hosts that don't support semihosting

Signed-off-by: Keith Packard <keithp@keithp.com>
---
 hw/semihosting/console.c          | 73 +++++++++++++++++++++++++++++++
 include/hw/semihosting/console.h  | 12 +++++
 include/hw/semihosting/semihost.h |  4 ++
 linux-user/arm/semihost.c         | 24 ++++++++++
 target/arm/arm-semi.c             |  3 +-
 vl.c                              |  3 ++
 6 files changed, 117 insertions(+), 2 deletions(-)

Alex Bennée Oct. 25, 2019, 9:51 a.m. UTC | #1

Keith Packard <keithp@keithp.com> writes:

> Provides a blocking call to read a character from the console using
> semihosting.chardev, if specified. This takes some careful command
> line options to use stdio successfully as the serial ports, monitor
> and semihost all want to use stdio. Here's a sample set of command
> line options which share stdio betwen semihost, monitor and serial
> ports:
>
> 	qemu \
> 	-chardev stdio,mux=on,id=stdio0 \
> 	-serial chardev:stdio0 \
> 	-semihosting-config enable=on,chardev=stdio0 \
> 	-mon chardev=stdio0,mode=readline

I can see the use for this but I'd like to know what you are testing
with. We only have very basic smoketests in check-tcg but I've tested
with the latest arm-semihosting tests and they are all fine so no
regressions there.

>
> This creates a chardev hooked to stdio and then connects all of the
> subsystems to it. A shorter mechanism would be good to hear about.

Please keep version history bellow --- so they get dropped when the
patch is applied.

>
> v2:
> 	Add implementation in linux-user/arm/semihost.c
>
> v3:  (thanks to Paolo Bonzini <pbonzini@redhat.com>)
> 	Replace hand-rolled fifo with fifo8
> 	Avoid mixing code and declarations
> 	Remove spurious (void) cast of function parameters
> 	Define qemu_semihosting_console_init when CONFIG_USER_ONLY
>
> v4:
> 	Add qemu_semihosting_console_init to stubs/semihost.c for
> 	hosts that don't support semihosting

This doesn't appear to be in the diff which is why I'm seeing a compile
failure for non-CONFIG_SEMIHOST machines. However...

>
> Signed-off-by: Keith Packard <keithp@keithp.com>
> ---
>  hw/semihosting/console.c          | 73 +++++++++++++++++++++++++++++++
>  include/hw/semihosting/console.h  | 12 +++++
>  include/hw/semihosting/semihost.h |  4 ++
>  linux-user/arm/semihost.c         | 24 ++++++++++
>  target/arm/arm-semi.c             |  3 +-
>  vl.c                              |  3 ++
>  6 files changed, 117 insertions(+), 2 deletions(-)
>
> diff --git a/hw/semihosting/console.c b/hw/semihosting/console.c
> index b4b17c8afb..197bff079b 100644
> --- a/hw/semihosting/console.c
> +++ b/hw/semihosting/console.c
> @@ -98,3 +98,76 @@ void qemu_semihosting_console_outc(CPUArchState *env, target_ulong addr)
>                        __func__, addr);
>      }
>  }
> +
> +#include <pthread.h>
> +#include "chardev/char-fe.h"
> +#include "sysemu/sysemu.h"
> +#include "qemu/main-loop.h"
> +#include "qapi/error.h"
> +#include "qemu/fifo8.h"
> +
> +#define FIFO_SIZE   1024
> +
> +typedef struct SemihostingConsole {
> +    CharBackend         backend;
> +    pthread_mutex_t     mutex;
> +    pthread_cond_t      cond;
> +    bool                got;
> +    Fifo8               fifo;
> +} SemihostingConsole;
> +
> +static SemihostingConsole console = {
> +    .mutex = PTHREAD_MUTEX_INITIALIZER,
> +    .cond = PTHREAD_COND_INITIALIZER
> +};
> +
> +static int console_can_read(void *opaque)
> +{
> +    SemihostingConsole *c = opaque;
> +    int ret;
> +    pthread_mutex_lock(&c->mutex);
> +    ret = (int) fifo8_num_free(&c->fifo);
> +    pthread_mutex_unlock(&c->mutex);
> +    return ret;
> +}
> +
> +static void console_read(void *opaque, const uint8_t *buf, int size)
> +{
> +    SemihostingConsole *c = opaque;
> +    pthread_mutex_lock(&c->mutex);
> +    while (size-- && !fifo8_is_full(&c->fifo)) {
> +        fifo8_push(&c->fifo, *buf++);
> +    }
> +    pthread_cond_broadcast(&c->cond);
> +    pthread_mutex_unlock(&c->mutex);
> +}
> +
> +target_ulong qemu_semihosting_console_inc(CPUArchState *env)
> +{
> +    uint8_t ch;
> +    SemihostingConsole *c = &console;
> +    qemu_mutex_unlock_iothread();
> +    pthread_mutex_lock(&c->mutex);
> +    while (fifo8_is_empty(&c->fifo)) {
> +        pthread_cond_wait(&c->cond, &c->mutex);
> +    }
> +    ch = fifo8_pop(&c->fifo);
> +    pthread_mutex_unlock(&c->mutex);
> +    qemu_mutex_lock_iothread();
> +    return (target_ulong) ch;
> +}
> +
> +void qemu_semihosting_console_init(void)
> +{
> +    Chardev *chr = semihosting_get_chardev();
> +
> +    if  (chr) {
> +        fifo8_create(&console.fifo, FIFO_SIZE);
> +        qemu_chr_fe_init(&console.backend, chr, &error_abort);
> +        qemu_chr_fe_set_handlers(&console.backend,
> +                                 console_can_read,
> +                                 console_read,
> +                                 NULL, NULL, &console,
> +                                 NULL, true);
> +    }
> +}
> diff --git a/include/hw/semihosting/console.h b/include/hw/semihosting/console.h
> index 9be9754bcd..f7d5905b41 100644
> --- a/include/hw/semihosting/console.h
> +++ b/include/hw/semihosting/console.h
> @@ -37,6 +37,18 @@ int qemu_semihosting_console_outs(CPUArchState *env, target_ulong s);
>   */
>  void qemu_semihosting_console_outc(CPUArchState *env, target_ulong c);
>
> +/**
> + * qemu_semihosting_console_inc:
> + * @env: CPUArchState
> + *
> + * Receive single character from debug console. This
> + * may be the remote gdb session if a softmmu guest is currently being
> + * debugged.
> + *
> + * Returns: character read or -1 on error
> + */
> +target_ulong qemu_semihosting_console_inc(CPUArchState *env);
> +
>  /**
>   * qemu_semihosting_log_out:
>   * @s: pointer to string
> diff --git a/include/hw/semihosting/semihost.h b/include/hw/semihosting/semihost.h
> index 60fc42d851..b8ce5117ae 100644
> --- a/include/hw/semihosting/semihost.h
> +++ b/include/hw/semihosting/semihost.h
> @@ -56,6 +56,9 @@ static inline Chardev *semihosting_get_chardev(void)
>  {
>      return NULL;
>  }
> +static inline void qemu_semihosting_console_init(void)
> +{
> +}
>  #else /* !CONFIG_USER_ONLY */
>  bool semihosting_enabled(void);
>  SemihostingTarget semihosting_get_target(void);
> @@ -68,6 +71,7 @@ Chardev *semihosting_get_chardev(void);
>  void qemu_semihosting_enable(void);
>  int qemu_semihosting_config_options(const char *opt);
>  void qemu_semihosting_connect_chardevs(void);
> +void qemu_semihosting_console_init(void);
>  #endif /* CONFIG_USER_ONLY */
>
>  #endif /* SEMIHOST_H */
> diff --git a/linux-user/arm/semihost.c b/linux-user/arm/semihost.c
> index a16b525eec..13a097515b 100644
> --- a/linux-user/arm/semihost.c
> +++ b/linux-user/arm/semihost.c
> @@ -47,3 +47,27 @@ void qemu_semihosting_console_outc(CPUArchState *env, target_ulong addr)
>          }
>      }
>  }
> +
> +#include <poll.h>

Headers should go at the top...I was about to discuss the usage of
poll() but I realise we are in linux-user here so non-POSIX portability
isn't an issue.

> +
> +target_ulong qemu_semihosting_console_inc(CPUArchState *env)
> +{
> +    uint8_t c;
> +    struct pollfd pollfd = {
> +        .fd = STDIN_FILENO,
> +        .events = POLLIN
> +    };
> +
> +    if (poll(&pollfd, 1, -1) != 1) {
> +        qemu_log_mask(LOG_UNIMP, "%s: unexpected read from stdin failure",
> +                      __func__);
> +        return (target_ulong) -1;
> +    }
> +
> +    if (read(STDIN_FILENO, &c, 1) != 1) {
> +        qemu_log_mask(LOG_UNIMP, "%s: unexpected read from stdin failure",
> +                      __func__);
> +        return (target_ulong) -1;
> +    }
> +    return (target_ulong) c;
> +}
> diff --git a/target/arm/arm-semi.c b/target/arm/arm-semi.c
> index 6f7b6d801b..47d61f6fe1 100644
> --- a/target/arm/arm-semi.c
> +++ b/target/arm/arm-semi.c
> @@ -802,8 +802,7 @@ target_ulong do_arm_semihosting(CPUARMState *env)
>
>          return guestfd_fns[gf->type].readfn(cpu, gf, arg1, len);
>      case TARGET_SYS_READC:
> -        qemu_log_mask(LOG_UNIMP, "%s: SYS_READC not implemented", __func__);
> -        return 0;
> +        return qemu_semihosting_console_inc(env);

I'm not sure this would be correct if there was no character available.
The docs imply it blocks although don't say so explicitly AFAICT.

>      case TARGET_SYS_ISTTY:
>          GET_ARG(0);
>
> diff --git a/vl.c b/vl.c
> index 4489cfb2bb..ac584d97ea 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4381,6 +4381,9 @@ int main(int argc, char **argv, char **envp)
>      ds = init_displaystate();
>      qemu_display_init(ds, &dpy);
>
> +    /* connect semihosting console input if requested */
> +    qemu_semihosting_console_init();
> +

I'd rather rename qemu_semihosting_connect_chardevs to
qemu_semihosting_init and keep all our bits of semihosting setup in
there rather than having multiple calls out of vl.c

>      /* must be after terminal init, SDL library changes signal handlers */
>      os_setup_signal_handling();


--
Alex Bennée

Keith Packard Oct. 25, 2019, 4:36 p.m. UTC | #2

Alex Bennée <alex.bennee@linaro.org> writes:

> I can see the use for this but I'd like to know what you are testing
> with. We only have very basic smoketests in check-tcg but I've tested
> with the latest arm-semihosting tests and they are all fine so no
> regressions there.

I'm adding semihosting support to picolibc
(https://keithp.com/picolibc/) and need a way to automate tests for it's
SYS_READC support, so eventually I'll have automated tests there, but
that depends on qemu support...

> Please keep version history bellow --- so they get dropped when the
> patch is applied.

Sure, I'll edit the mail before sending. In my repo, I'm leaving the
version history in git so I can keep track of it.

>> v4:
>> 	Add qemu_semihosting_console_init to stubs/semihost.c for
>> 	hosts that don't support semihosting
>
> This doesn't appear to be in the diff which is why I'm seeing a compile
> failure for non-CONFIG_SEMIHOST machines. However...

Argh! Just git operation failure -- I'm building another patch on top of
this (for RISC-V semihosting support) and the stubs/semihost.c change
got caught in there. My overall check for changes missed this mistake.

>> +
>> +#include <poll.h>
>
> Headers should go at the top.

Thanks; I've fixed this file and hw/semihosting/console.c

>>      case TARGET_SYS_READC:
>> -        qemu_log_mask(LOG_UNIMP, "%s: SYS_READC not implemented", __func__);
>> -        return 0;
>> +        return qemu_semihosting_console_inc(env);
>
> I'm not sure this would be correct if there was no character available.
> The docs imply it blocks although don't say so explicitly AFAICT.

Here's what the docs say:

        https://static.docs.arm.com/100863/0200/semihosting.pdf

        Return

        On exit, the RETURN REGISTER contains the byte read from the console.

If this call didn't block, they'd have to define a value which indicated
that no byte was available? Openocd also implements SYS_READC using
'getchar()', which definitely blocks. So, at least qemu would be the
same.

I realize it's really weird to block the whole emulation for this call,
but given the target environment (deeply embedded devices), it's quite
convenient as the whole qemu process blocks, instead of spinning.

>> +    /* connect semihosting console input if requested */
>> +    qemu_semihosting_console_init();
>> +
>
> I'd rather rename qemu_semihosting_connect_chardevs to
> qemu_semihosting_init and keep all our bits of semihosting setup in
> there rather than having multiple calls out of vl.c

I also thought this would be a nice cleanup. However, the last caller to
qemu_chr_fe_set_handlers gets the focus for input, and connect_chardevs
is called before the serial ports and monitor are initialized, so
semihosting gets pushed aside and stdio input ends up hooked to the monitor
instead.

Adding this function and placing the call after the other stdio users
get hooked up means that semihosting starts with the input focus.

Peter Maydell Oct. 25, 2019, 4:49 p.m. UTC | #3

On Fri, 25 Oct 2019 at 17:40, Keith Packard <keithp@keithp.com> wrote:
>
> Alex Bennée <alex.bennee@linaro.org> writes:
>
> > I can see the use for this but I'd like to know what you are testing
> > with. We only have very basic smoketests in check-tcg but I've tested
> > with the latest arm-semihosting tests and they are all fine so no
> > regressions there.
>
> I'm adding semihosting support to picolibc
> (https://keithp.com/picolibc/) and need a way to automate tests for it's
> SYS_READC support, so eventually I'll have automated tests there, but
> that depends on qemu support...

I have a cheesy-but-functional set of test code
https://git.linaro.org/people/peter.maydell/semihosting-tests.git/
here, fwiw.

> Argh! Just git operation failure -- I'm building another patch on top of
> this (for RISC-V semihosting support) and the stubs/semihost.c change
> got caught in there. My overall check for changes missed this mistake.

Is there a specification for RISC-V semihosting? This is
likely to be my first question when the support comes
round for review, so you can have it early :-)  We'd
prefer to implement specified interfaces, not random
ad-hoc "this seems to be what newlib wants to see,
which is turn got hacked together by copying some other
architecture's code".

> Here's what the docs say:
>
>         https://static.docs.arm.com/100863/0200/semihosting.pdf
>
>         Return
>
>         On exit, the RETURN REGISTER contains the byte read from the console.
>
> If this call didn't block, they'd have to define a value which indicated
> that no byte was available? Openocd also implements SYS_READC using
> 'getchar()', which definitely blocks. So, at least qemu would be the
> same.
>
> I realize it's really weird to block the whole emulation for this call,
> but given the target environment (deeply embedded devices), it's quite
> convenient as the whole qemu process blocks, instead of spinning.

Yeah, the docs could perhaps spell it out more clearly, but
the intention is that the call blocks.

It's possible if necessary to implement this so it doesn't
block the entire simulation: you just have the QEMU implementation
do a block-until-character-or-timeout operation, and if it times
out then you rewind the guest PC to point at the BRK/HLT insn that
triggered the semihosting call, and resume simulation. That would
give an opportunity for things like interrupts to fire before going
back into waiting for a character. This feels to me like it's a bit
overcomplicated unless it turns out we actually require it though.

> I also thought this would be a nice cleanup. However, the last caller to
> qemu_chr_fe_set_handlers gets the focus for input, and connect_chardevs
> is called before the serial ports and monitor are initialized, so
> semihosting gets pushed aside and stdio input ends up hooked to the monitor
> instead.

Isn't the answer to this "don't use a command line that tries
to connect stdio to multiple things" ?

thanks
-- PMM

Alex Bennée Oct. 25, 2019, 5:02 p.m. UTC | #4

Keith Packard <keithp@keithp.com> writes:

> Alex Bennée <alex.bennee@linaro.org> writes:
>
<snip>
>> Please keep version history bellow --- so they get dropped when the
>> patch is applied.
>
> Sure, I'll edit the mail before sending. In my repo, I'm leaving the
> version history in git so I can keep track of it.

It's OK to keep the history in git, I do it all the time. But by using a
--- divider in your commit message the git-am tool will trim it off when
it gets applied from a mailing list. There is no need to manually edit
your patches as that would be error prone.

To pick a random example of what I mean from my tree:

  https://github.com/stsquad/qemu/commit/1b648ac695d45a4e4d72cf64a97346d9695b863b

--
Alex Bennée

no-reply@patchew.org Oct. 25, 2019, 6:17 p.m. UTC | #5

Patchew URL: https://patchew.org/QEMU/20191024224622.12371-1-keithp@keithp.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      aarch64-softmmu/trace/generated-helpers.o
../vl.o: In function `main':
/tmp/qemu-test/src/vl.c:4385: undefined reference to `qemu_semihosting_console_init'
collect2: error: ld returned 1 exit status
make[1]: *** [qemu-system-x86_64] Error 1
make: *** [x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs....
  LINK    aarch64-softmmu/qemu-system-aarch64
Traceback (most recent call last):
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=56d753bff38f4c6794bd90f9a5f3e2df', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-kcdksyw0/src/docker-src.2019-10-25-14.14.31.23494:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=56d753bff38f4c6794bd90f9a5f3e2df
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-kcdksyw0/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    2m45.581s
user    0m8.220s


The full log is available at
http://patchew.org/logs/20191024224622.12371-1-keithp@keithp.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

no-reply@patchew.org Oct. 25, 2019, 6:20 p.m. UTC | #6

Patchew URL: https://patchew.org/QEMU/20191024224622.12371-1-keithp@keithp.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      aarch64-softmmu/target/arm/translate-sve.o
../vl.o: In function `qemu_main':
/tmp/qemu-test/src/vl.c:4385: undefined reference to `qemu_semihosting_console_init'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:206: qemu-system-x86_64w.exe] Error 1
make: *** [Makefile:482: x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs....
  LINK    aarch64-softmmu/qemu-system-aarch64w.exe
  GEN     aarch64-softmmu/qemu-system-aarch64.exe
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=c8023af7d17f49c389a9dbb7c2292a6e', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-wurz4hht/src/docker-src.2019-10-25-14.17.51.2074:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=c8023af7d17f49c389a9dbb7c2292a6e
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-wurz4hht/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    2m52.599s
user    0m8.472s


The full log is available at
http://patchew.org/logs/20191024224622.12371-1-keithp@keithp.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Keith Packard Oct. 25, 2019, 7:15 p.m. UTC | #7

Peter Maydell <peter.maydell@linaro.org> writes:

> Is there a specification for RISC-V semihosting? This is
> likely to be my first question when the support comes
> round for review, so you can have it early :-)  We'd
> prefer to implement specified interfaces, not random
> ad-hoc "this seems to be what newlib wants to see,
> which is turn got hacked together by copying some other
> architecture's code".

There seems to be convergence on a pretty simple interface which uses
ebreak surrounded by a couple of specific no-ops:

      slli x0, x0, 0x1f
      ebreak
      srai x0, x0, 0x7

There are implementations in rust and openocd, and I've got one for
picolibc. The risc-v semihosting code is sitting on a branch in my repo
on github:

        https://github.com/keith-packard/qemu/tree/riscv-semihost

> (describing a mechanism to avoid stopping the emulator)
> This feels to me like it's a bit overcomplicated unless it turns out
> we actually require it though.

Would also be nice for multi-core setups. I'd like to start with the
simple plan for now.

> Isn't the answer to this "don't use a command line that tries
> to connect stdio to multiple things" ?

Uh, we do that all the time? The mux device is designed to handle this
so that you can use stdio for both monitor commands and application
I/O. It's very convenient, the only issue is that the last device that
hooks to the mux ends up getting input first (you use ^Ac to rotate
among the selected devices).

Peter Maydell Oct. 25, 2019, 8:53 p.m. UTC | #8

On Fri, 25 Oct 2019 at 20:15, Keith Packard <keithp@keithp.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > Is there a specification for RISC-V semihosting? This is
> > likely to be my first question when the support comes
> > round for review, so you can have it early :-)  We'd
> > prefer to implement specified interfaces, not random
> > ad-hoc "this seems to be what newlib wants to see,
> > which is turn got hacked together by copying some other
> > architecture's code".
>
> There seems to be convergence on a pretty simple interface which uses
> ebreak surrounded by a couple of specific no-ops:
>
>       slli x0, x0, 0x1f
>       ebreak
>       srai x0, x0, 0x7
>
> There are implementations in rust and openocd, and I've got one for
> picolibc.

I'm going to push for somebody actually writing out a
document and putting it somewhere that we can point to
and say "that's the authoritative spec", please...
it doesn't have to be a big formal thing, but I do
think you want it written down, because the whole point
is for multiple implementations and users to interoperate.

> > Isn't the answer to this "don't use a command line that tries
> > to connect stdio to multiple things" ?
>
> Uh, we do that all the time? The mux device is designed to handle this
> so that you can use stdio for both monitor commands and application
> I/O. It's very convenient, the only issue is that the last device that
> hooks to the mux ends up getting input first (you use ^Ac to rotate
> among the selected devices).

Yeah, the mux works fine for this kind of thing. There's
no inherent reason why semihosting ought to "win" as
the initially selected thing on the mux, though --
typically that would be expected to be the UART/serial
console.

thanks
-- PMM

Keith Packard Oct. 25, 2019, 11:18 p.m. UTC | #9

Peter Maydell <peter.maydell@linaro.org> writes:

> I'm going to push for somebody actually writing out a
> document and putting it somewhere that we can point to
> and say "that's the authoritative spec", please...
> it doesn't have to be a big formal thing, but I do
> think you want it written down, because the whole point
> is for multiple implementations and users to interoperate.

I can work within the RISC-V foundation to get an 'official' document
written. Having a handful of existing inter-operable implementations
will make that really easy to do :-)

> Yeah, the mux works fine for this kind of thing. There's
> no inherent reason why semihosting ought to "win" as
> the initially selected thing on the mux, though --
> typically that would be expected to be the UART/serial
> console.

That would just require moving the call to qemu_semihosting_console_init
up in the function. Doesn't really matter to me; I suspect that most
users will either user serial or semihosting, but probably not both
(except when debugging the serial driver).

This does the trick (on top of the latest patch). Let me know if this is
what you want. To get semihosting to be first, you have to disable the
serial driver if the hardware has a serial port:

        qemu -chardev stdio,mux=on,id=stdio0 \
             -serial null \
             -semihosting-config enable=on,chardev=stdio0 \
             -mon chardev=stdio0,mode=readline"

diff --git a/vl.c b/vl.c
index ac584d97ea..7ea8a907fd 100644
--- a/vl.c
+++ b/vl.c
@@ -4284,6 +4284,9 @@ int main(int argc, char **argv, char **envp)
     qemu_opts_foreach(qemu_find_opts("mon"),
                       mon_init_func, NULL, &error_fatal);
 
+    /* connect semihosting console input if requested */
+    qemu_semihosting_console_init();
+
     if (foreach_device_config(DEV_SERIAL, serial_parse) < 0)
         exit(1);
     if (foreach_device_config(DEV_PARALLEL, parallel_parse) < 0)
@@ -4381,9 +4384,6 @@ int main(int argc, char **argv, char **envp)
     ds = init_displaystate();
     qemu_display_init(ds, &dpy);
 
-    /* connect semihosting console input if requested */
-    qemu_semihosting_console_init();
-
     /* must be after terminal init, SDL library changes signal handlers */
     os_setup_signal_handling();

Keith Packard Nov. 5, 2019, 5:10 a.m. UTC | #10

Peter Maydell <peter.maydell@linaro.org> writes:

> I'm going to push for somebody actually writing out a
> document and putting it somewhere that we can point to
> and say "that's the authoritative spec", please...
> it doesn't have to be a big formal thing, but I do
> think you want it written down, because the whole point
> is for multiple implementations and users to interoperate.

That happened in June -- I was just looking at the wrong version of the
spec. In the current version, which can be found here:

        https://riscv.org/specifications/

                   The RISC-V Instruction Set Manual
                       Volume I: Unprivileged ISA
                Document Version 20190608-Base-Ratified
                                    
Section 2.8 says:

        Another use of EBREAK is to support “semihosting”, where the
        execution environment includes a debugger that can provide
        services over an alternate system call interface built around
        the EBREAK instruction.  Because the RISC-V base ISA does not
        provide more than one EBREAK instruction, RISC-V semihosting
        uses a special sequence of instructions to distinguish a
        semihosting EBREAK from a debugger inserted EBREAK.

                slli x0, x0, 0x1f   # Entry NOP
                ebreak              # Break to debugger
                srai x0, x0, 7      # NOP encoding the semihosting call number 7

        Note that these three instructions must be 32-bit-wide
        instructions, i.e., they mustn’t be among the compressed 16-bit
        instructions described in Chapter 16.

        The shift NOP instructions are still considered available for
        use as HINTS.

        Semihosting is a form of service call and would be more
        naturally encoded as an ECALL using an existing ABI, but this
        would require the debugger to be able to intercept ECALLs, which
        is a newer addition to the debug standard.  We intend to move
        over to using ECALLs with a standard ABI, in which case,
        semihosting can share a service ABI with an existing standard.

        We note that ARM processors have also moved to using SVC instead
        of BKPT for semihosting calls in newer designs.

I'd like to get the READC patch landed and then post the RISC-V patch
afterwards as the RISC-V patch currently includes READC support.

Peter Maydell Nov. 11, 2019, 2:51 p.m. UTC | #11

On Tue, 5 Nov 2019 at 05:10, Keith Packard <keithp@keithp.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > I'm going to push for somebody actually writing out a
> > document and putting it somewhere that we can point to
> > and say "that's the authoritative spec", please...
> > it doesn't have to be a big formal thing, but I do
> > think you want it written down, because the whole point
> > is for multiple implementations and users to interoperate.
>
> That happened in June -- I was just looking at the wrong version of the
> spec. In the current version, which can be found here:
>
>         https://riscv.org/specifications/
>
>                    The RISC-V Instruction Set Manual
>                        Volume I: Unprivileged ISA
>                 Document Version 20190608-Base-Ratified
>
> Section 2.8 says:
>
>         Another use of EBREAK is to support “semihosting”, where the
>         execution environment includes a debugger that can provide
>         services over an alternate system call interface built around
>         the EBREAK instruction.  Because the RISC-V base ISA does not
>         provide more than one EBREAK instruction, RISC-V semihosting
>         uses a special sequence of instructions to distinguish a
>         semihosting EBREAK from a debugger inserted EBREAK.
>
>                 slli x0, x0, 0x1f   # Entry NOP
>                 ebreak              # Break to debugger
>                 srai x0, x0, 7      # NOP encoding the semihosting call number 7
>
>         Note that these three instructions must be 32-bit-wide
>         instructions, i.e., they mustn’t be among the compressed 16-bit
>         instructions described in Chapter 16.
>
>         The shift NOP instructions are still considered available for
>         use as HINTS.
>
>         Semihosting is a form of service call and would be more
>         naturally encoded as an ECALL using an existing ABI, but this
>         would require the debugger to be able to intercept ECALLs, which
>         is a newer addition to the debug standard.  We intend to move
>         over to using ECALLs with a standard ABI, in which case,
>         semihosting can share a service ABI with an existing standard.
>
>         We note that ARM processors have also moved to using SVC instead
>         of BKPT for semihosting calls in newer designs.

That defines the instruction sequence used to make a semihosting
call, but not the specification of what the calls are:
 * what call numbers perform which functions
 * how arguments are passed to the call (registers? parameter
   blocks in memory? other?)
 * the semantics of each function supported (number of arguments,
   behaviour, error handling)

That's really what I had in mind by the overall semihosting spec.

PS: the parenthetical about ARM semihosting at the bottom  of
the text you quote is wrong, incidentally. The traditional insn
for semihosting on A-profile devices has always been SWI/SVC; it
is BKPT only on M-profile devices; and the latest revision of the
semihosting spec recommends the HLT instruction for both A- and M-.

thanks
-- PMM

Alistair Francis Nov. 14, 2019, 3:46 p.m. UTC | #12

On Mon, Nov 11, 2019 at 6:51 AM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Tue, 5 Nov 2019 at 05:10, Keith Packard <keithp@keithp.com> wrote:
> >
> > Peter Maydell <peter.maydell@linaro.org> writes:
> >
> > > I'm going to push for somebody actually writing out a
> > > document and putting it somewhere that we can point to
> > > and say "that's the authoritative spec", please...
> > > it doesn't have to be a big formal thing, but I do
> > > think you want it written down, because the whole point
> > > is for multiple implementations and users to interoperate.
> >
> > That happened in June -- I was just looking at the wrong version of the
> > spec. In the current version, which can be found here:
> >
> >         https://riscv.org/specifications/
> >
> >                    The RISC-V Instruction Set Manual
> >                        Volume I: Unprivileged ISA
> >                 Document Version 20190608-Base-Ratified
> >
> > Section 2.8 says:
> >
> >         Another use of EBREAK is to support “semihosting”, where the
> >         execution environment includes a debugger that can provide
> >         services over an alternate system call interface built around
> >         the EBREAK instruction.  Because the RISC-V base ISA does not
> >         provide more than one EBREAK instruction, RISC-V semihosting
> >         uses a special sequence of instructions to distinguish a
> >         semihosting EBREAK from a debugger inserted EBREAK.
> >
> >                 slli x0, x0, 0x1f   # Entry NOP
> >                 ebreak              # Break to debugger
> >                 srai x0, x0, 7      # NOP encoding the semihosting call number 7
> >
> >         Note that these three instructions must be 32-bit-wide
> >         instructions, i.e., they mustn’t be among the compressed 16-bit
> >         instructions described in Chapter 16.
> >
> >         The shift NOP instructions are still considered available for
> >         use as HINTS.
> >
> >         Semihosting is a form of service call and would be more
> >         naturally encoded as an ECALL using an existing ABI, but this
> >         would require the debugger to be able to intercept ECALLs, which
> >         is a newer addition to the debug standard.  We intend to move
> >         over to using ECALLs with a standard ABI, in which case,
> >         semihosting can share a service ABI with an existing standard.
> >
> >         We note that ARM processors have also moved to using SVC instead
> >         of BKPT for semihosting calls in newer designs.
>
> That defines the instruction sequence used to make a semihosting
> call, but not the specification of what the calls are:
>  * what call numbers perform which functions
>  * how arguments are passed to the call (registers? parameter
>    blocks in memory? other?)
>  * the semantics of each function supported (number of arguments,
>    behaviour, error handling)
>
> That's really what I had in mind by the overall semihosting spec.

This sounds like something that the platform spec should contain.

+Atish +Palmer who are in charge of that.

@Keith maybe you should contact the platform spec group and ask them
to look into this?

Alistair

>
> PS: the parenthetical about ARM semihosting at the bottom  of
> the text you quote is wrong, incidentally. The traditional insn
> for semihosting on A-profile devices has always been SWI/SVC; it
> is BKPT only on M-profile devices; and the latest revision of the
> semihosting spec recommends the HLT instruction for both A- and M-.
>
> thanks
> -- PMM
>

Peter Maydell Nov. 14, 2019, 4:14 p.m. UTC | #13

On Fri, 25 Oct 2019 at 20:15, Keith Packard <keithp@keithp.com> wrote:
> There seems to be convergence on a pretty simple interface which uses
> ebreak surrounded by a couple of specific no-ops:
>
>       slli x0, x0, 0x1f
>       ebreak
>       srai x0, x0, 0x7
>
> There are implementations in rust and openocd, and I've got one for
> picolibc. The risc-v semihosting code is sitting on a branch in my repo
> on github:
>
>         https://github.com/keith-packard/qemu/tree/riscv-semihost

I had an idle glance at this implementation, and this:

   uint32_t pre = opcode_at(&ctx->base, ctx->base.pc_next - 4);
   uint32_t ebreak = opcode_at(&ctx->base, ctx->base.pc_next);
   uint32_t post = opcode_at(&ctx->base, ctx->base.pc_next + 4);

(where opcode_at() is a wrapper for cpu_ldl_code()) has
some unfortunate side effects: if the previous instruction
is in the previous MMU page, or the following instruction
is in the next MMU page, you might incorrectly trigger
an exception (where QEMU will just longjmp straight out of
the cpu_ldl_code()) if that other page isn't actually mapped
in the guest's page table. You need to be careful not to access
code outside the page you're actually on unless you're really
going to execute it and are OK with it faulting.

I'm not sure what the best way on QEMU to identify this kind
of multiple-instruction-sequence is -- Richard might have an
idea. One approach would be to always take an exception
(or call a helper function) for the 'ebreak', and then
distinguish semihosting from other kinds of ebreak by doing
the load of the preceding/succeeding instructions at runtime
rather than translate time.

Does your semihosting spec expect to have the semihosting
call work if the sequence crosses a page boundary, the
code is being executed by a userspace process, and one of
the two pages has been paged out by the OS ?

thanks
-- PMM

Keith Packard Nov. 14, 2019, 5:39 p.m. UTC | #14

Peter Maydell <peter.maydell@linaro.org> writes:

> That defines the instruction sequence used to make a semihosting
> call, but not the specification of what the calls are:
>  * what call numbers perform which functions
>  * how arguments are passed to the call (registers? parameter
>    blocks in memory? other?)
>  * the semantics of each function supported (number of arguments,
>    behaviour, error handling)
>
> That's really what I had in mind by the overall semihosting spec.

There isn't anything more provided by the RISC-V foundation at this
point. I'm not sure what else they *should* provide as the goal is to
match the ARM design, which does specify all of the above.

Keith Packard Nov. 14, 2019, 5:43 p.m. UTC | #15

Alistair Francis <alistair23@gmail.com> writes:

> This sounds like something that the platform spec should contain.

I'm frankly happy with it specifying the semantics by reference to the
ARM docs -- that way we can easily share existing code without concern
about subtle semantic differences.

The only thing that would be useful is to clarify the low-level ABI
details about argument passing, but given that the ARM semihosting spec
passes arguments in the standard registers, extrapolating that to RISC-V
is not exactly difficult.

> +Atish +Palmer who are in charge of that.
>
> @Keith maybe you should contact the platform spec group and ask them
> to look into this?

I can certainly ask to have the argument passing details clarified; I
doubt that group would be interested in adopting the whole semihosting
spec and publishing it separately.

Peter Maydell Nov. 14, 2019, 5:47 p.m. UTC | #16

On Thu, 14 Nov 2019 at 17:39, Keith Packard <keithp@keithp.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > That defines the instruction sequence used to make a semihosting
> > call, but not the specification of what the calls are:
> >  * what call numbers perform which functions
> >  * how arguments are passed to the call (registers? parameter
> >    blocks in memory? other?)
> >  * the semantics of each function supported (number of arguments,
> >    behaviour, error handling)
> >
> > That's really what I had in mind by the overall semihosting spec.
>
> There isn't anything more provided by the RISC-V foundation at this
> point. I'm not sure what else they *should* provide as the goal is to
> match the ARM design, which does specify all of the above.

This isn't obvious if you just say "semihosting".
QEMU currently provides 'semihosting' ABIs for:
 * arm
 * lm32
 * m68k
 * mips
 * nios2
 * xtensa

m68k and nios2 provide basically the same set of calls,
but all the other architectures differe from each other.
"Semihosting" just means "the concept of an ABI from guest
to the debugger or emulator", not a particular ABI.

The ARM semihosting ABI also has a number of warts
which are basically historical legacy. With a clean
sheet you get to avoid some of them. (Notably you could
skip the whole 'negotiate presence of extensions' business
by just getting those parts right from the start. Actually
specifying errno values would also be sensible and is
something the ARM spec sadly does not do and can't really
at this point with multiple implementations in play.)

thanks
-- PMM

Keith Packard Nov. 14, 2019, 6:05 p.m. UTC | #17

Peter Maydell <peter.maydell@linaro.org> writes:

> I had an idle glance at this implementation, and this:
>
>    uint32_t pre = opcode_at(&ctx->base, ctx->base.pc_next - 4);
>    uint32_t ebreak = opcode_at(&ctx->base, ctx->base.pc_next);
>    uint32_t post = opcode_at(&ctx->base, ctx->base.pc_next + 4);
>
> (where opcode_at() is a wrapper for cpu_ldl_code()) has
> some unfortunate side effects: if the previous instruction
> is in the previous MMU page, or the following instruction
> is in the next MMU page, you might incorrectly trigger
> an exception (where QEMU will just longjmp straight out of
> the cpu_ldl_code()) if that other page isn't actually mapped
> in the guest's page table. You need to be careful not to access
> code outside the page you're actually on unless you're really
> going to execute it and are OK with it faulting.

I can't even find the implementation of cpu_ldl_code; the qemu source
code is somewhat obscure in this area. But, longjmp'ing out of the
middle of that seems like a bad idea.

> Does your semihosting spec expect to have the semihosting
> call work if the sequence crosses a page boundary, the
> code is being executed by a userspace process, and one of
> the two pages has been paged out by the OS ?

You've seen the entirety of the RISC-V semihosting spec already.  For
now, perhaps we should limit RISC-V semihosting support to devices
without paging support and await a more complete spec.

As you suggest, disallowing the sequence from crossing a page boundary
would be a simple fix, but that would require wording changes to the
spec.

Peter Maydell Nov. 14, 2019, 6:18 p.m. UTC | #18

On Thu, 14 Nov 2019 at 18:05, Keith Packard <keithp@keithp.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > I had an idle glance at this implementation, and this:
> >
> >    uint32_t pre = opcode_at(&ctx->base, ctx->base.pc_next - 4);
> >    uint32_t ebreak = opcode_at(&ctx->base, ctx->base.pc_next);
> >    uint32_t post = opcode_at(&ctx->base, ctx->base.pc_next + 4);
> >
> > (where opcode_at() is a wrapper for cpu_ldl_code()) has
> > some unfortunate side effects: if the previous instruction
> > is in the previous MMU page, or the following instruction
> > is in the next MMU page, you might incorrectly trigger
> > an exception (where QEMU will just longjmp straight out of
> > the cpu_ldl_code()) if that other page isn't actually mapped
> > in the guest's page table. You need to be careful not to access
> > code outside the page you're actually on unless you're really
> > going to execute it and are OK with it faulting.
>
> I can't even find the implementation of cpu_ldl_code; the qemu source
> code is somewhat obscure in this area. But, longjmp'ing out of the
> middle of that seems like a bad idea.

It's the way QEMU works -- generally load/store operations
that work on virtual addresses are expected to only return
in the success case; on failure they longjmp out to cause
the guest exception. (Load/stores on physical addresses
generally return a memory transaction status for the caller
to check and handle.) I agree that within the translation code
it's a bit weird and it might be nicer for the translate.c
code to explicitly handle failures to load an insn, but it
would be a bit of an upheaval to try to rewrite it at this point.

cpu_ldl_code() is provided by include/exec/cpu_ldst.h, incidentally
(via preprocessor macros and repeated inclusion of some template
.h files, which is why a grep for the function name misses it).

> > Does your semihosting spec expect to have the semihosting
> > call work if the sequence crosses a page boundary, the
> > code is being executed by a userspace process, and one of
> > the two pages has been paged out by the OS ?
>
> You've seen the entirety of the RISC-V semihosting spec already.  For
> now, perhaps we should limit RISC-V semihosting support to devices
> without paging support and await a more complete spec.
>
> As you suggest, disallowing the sequence from crossing a page boundary
> would be a simple fix, but that would require wording changes to the
> spec.

Yeah, I'm implicitly suggesting that a bit more thought/revision
of the spec might not be a bad idea. Things that are
effectively supposed to be treated like a single instruction
but which can cross cacheline or page boundaries turn out
to be a fertile source of implementation bugs. (The arm
translate.c code has to be quite careful about handling
32-bit Thumb insns that cross pages. The x86 translate.c
code is less careful and may well be buggy in this area.)

thanks
-- PMM

Richard Henderson Nov. 14, 2019, 7:18 p.m. UTC | #19

On 11/14/19 5:14 PM, Peter Maydell wrote:
> On Fri, 25 Oct 2019 at 20:15, Keith Packard <keithp@keithp.com> wrote:
>> There seems to be convergence on a pretty simple interface which uses
>> ebreak surrounded by a couple of specific no-ops:
>>
>>       slli x0, x0, 0x1f
>>       ebreak
>>       srai x0, x0, 0x7
>>
>> There are implementations in rust and openocd, and I've got one for
>> picolibc. The risc-v semihosting code is sitting on a branch in my repo
>> on github:
>>
>>         https://github.com/keith-packard/qemu/tree/riscv-semihost
> 
> I had an idle glance at this implementation, and this:
> 
>    uint32_t pre = opcode_at(&ctx->base, ctx->base.pc_next - 4);
>    uint32_t ebreak = opcode_at(&ctx->base, ctx->base.pc_next);
>    uint32_t post = opcode_at(&ctx->base, ctx->base.pc_next + 4);
> 
> (where opcode_at() is a wrapper for cpu_ldl_code()) has
> some unfortunate side effects: if the previous instruction
> is in the previous MMU page, or the following instruction
> is in the next MMU page, you might incorrectly trigger
> an exception (where QEMU will just longjmp straight out of
> the cpu_ldl_code()) if that other page isn't actually mapped
> in the guest's page table. You need to be careful not to access
> code outside the page you're actually on unless you're really
> going to execute it and are OK with it faulting.
> 
> I'm not sure what the best way on QEMU to identify this kind
> of multiple-instruction-sequence is -- Richard might have an
> idea. One approach would be to always take an exception
> (or call a helper function) for the 'ebreak', and then
> distinguish semihosting from other kinds of ebreak by doing
> the load of the preceding/succeeding instructions at runtime
> rather than translate time.

It's irritating that this 3 insn sequence is already a thing, baked into other
sim/emulators.  Seems to me that it would have been easier to abscond with

    ebreak + func3 field != 0.

For semi-hosting, it seems even better if the semi-hosting syscall instruction
is not "real", because you're explicitly requesting services from "unreal"
hardware.  It should be specified to generate a SIGILL type of exception
anywhere semi-hosting is not enabled.

That said, it is possible to do this in the translator, by considering this
sequence to be one 12-byte instruction.

You'd build in a recognizer into the decoder such that, when semi-hosting is
enabled, seeing the entry nop starts looking ahead:

  - If this is not the first insn in the TB, and there are
    not 8 more bytes remaining on the page, terminate the
    current TB.  This will re-start the machine in the next TB.

  - If the complete sequence is not found, then advance the pc
    past the entry nop and let nature take its course wrt the
    subsequent instructions.

  - If the sequence crosses a page, then so be it.  Because of
    step 1, this only happens when we *must* cross a page, and
    will have recognized any paging exception anyway.
    The generic parts of qemu will handle proper invalidation of
    a TB that crosses a page boundary.

  - This is kinda similar to the target/sh4 code to handle
    generic user-space atomic sequences: c.f. decode_gusa().

  - This implementation does mean that you can't jump directly
    to the ebreak insn and have the sequence be recognized.
    Control must flow through the entry nop.
    This is a bug or a feature depending on the reviewer.  ;-)

With that in mind, it may be simpler to handle all of this not in the
translator, but in the function that delivers the ebreak exception.  At that
point one can arrange to read memory without raising additional exceptions.


r~

Peter Maydell Nov. 14, 2019, 7:20 p.m. UTC | #20

On Thu, 14 Nov 2019 at 17:47, Peter Maydell <peter.maydell@linaro.org> wrote:

> The ARM semihosting ABI also has a number of warts
> which are basically historical legacy. With a clean
> sheet you get to avoid some of them. (Notably you could
> skip the whole 'negotiate presence of extensions' business
> by just getting those parts right from the start

...better still, if you can define (a) a mandatory "return
version and feature bit info" call right from the start
and (b) the required behaviour for attempts to make
a semihosting call with an unknown/unimplemented
function number -- then you can avoid the nasty
kludge "magic behaviour by opening a magic filename"
we were stuck with specifying in arm semihosting 2.0.

thanks
-- PMM

Peter Maydell Nov. 14, 2019, 7:29 p.m. UTC | #21

On Thu, 14 Nov 2019 at 19:18, Richard Henderson
<richard.henderson@linaro.org> wrote:
>   - If the sequence crosses a page, then so be it.  Because of
>     step 1, this only happens when we *must* cross a page, and
>     will have recognized any paging exception anyway.
>     The generic parts of qemu will handle proper invalidation of
>     a TB that crosses a page boundary.

I'm not sure this would work. If you have
  insn1 insn2 || other-insn
(where || is the page boundary and page 2 is non-executable)
then the required behaviour is "execute insn1 and insn2 with
normal behaviour, then fault trying to read other-insn, with
the fault address being that of other-insn".
Whereas for
  insn1 insn2 || insn3
you want to treat it as a semihosting sequence. But you can't distinguish
the two because trying to read the word in page 2 will cause us to
generate a fault with the fault address being that of insn1. Or
have I forgotten how the page-crossing handling works ?

thanks
-- PMM

Richard Henderson Nov. 14, 2019, 8:52 p.m. UTC | #22

On 11/14/19 8:29 PM, Peter Maydell wrote:
> On Thu, 14 Nov 2019 at 19:18, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>   - If the sequence crosses a page, then so be it.  Because of
>>     step 1, this only happens when we *must* cross a page, and
>>     will have recognized any paging exception anyway.
>>     The generic parts of qemu will handle proper invalidation of
>>     a TB that crosses a page boundary.
> 
> I'm not sure this would work. If you have
>   insn1 insn2 || other-insn
> (where || is the page boundary and page 2 is non-executable)
> then the required behaviour is "execute insn1 and insn2 with
> normal behaviour, then fault trying to read other-insn, with
> the fault address being that of other-insn".
> Whereas for
>   insn1 insn2 || insn3
> you want to treat it as a semihosting sequence. But you can't distinguish
> the two because trying to read the word in page 2 will cause us to
> generate a fault with the fault address being that of insn1. Or
> have I forgotten how the page-crossing handling works ?

Yet another reason why I prefer any semi-hosting call to use an encoding that
is otherwise reserved illegal.

For this, you have to make up your mind: is it important to execute the
instructions as specified by the ISA, or as specified by the semi-hosting spec?

In this case, semi-hosting defines an "entry nop" that begins the sequence, and
I think that we are well within our rights to ignore the validity of "insn1
insn2 || other-insn".


r~

Peter Maydell Nov. 14, 2019, 9:04 p.m. UTC | #23

On Thu, 14 Nov 2019 at 20:52, Richard Henderson
<richard.henderson@linaro.org> wrote:
> Yet another reason why I prefer any semi-hosting call to use an encoding that
> is otherwise reserved illegal.
>
> For this, you have to make up your mind: is it important to execute the
> instructions as specified by the ISA, or as specified by the semi-hosting spec?
>
> In this case, semi-hosting defines an "entry nop" that begins the sequence, and
> I think that we are well within our rights to ignore the validity of "insn1
> insn2 || other-insn".

Perhaps. I think you get the same issue with
  insn1 || insn2
vs
  insn1 || some-other-insn

though. (And the spec has wording that explicitly wants the
latter to be handled with the normal "I'm a hint instruction"
behaviour of insn1.)

-- PMM

Keith Packard Nov. 14, 2019, 10:26 p.m. UTC | #24

Richard Henderson <richard.henderson@linaro.org> writes:

> For semi-hosting, it seems even better if the semi-hosting syscall instruction
> is not "real", because you're explicitly requesting services from "unreal"
> hardware.  It should be specified to generate a SIGILL type of exception
> anywhere semi-hosting is not enabled.

In the QEMU case, yes, it's virtual hardware. However, the other common case
for semihosting is when doing hardware bringup using JTAG or other
remote debugging link -- having an instruction which safely traps to the
debugger is required to make semihosting usable there. Reading through
the history of the current RISC-V semihosting mechanism, there were many
designs considered and rejected because they wouldn't work in the JTAG
environment.

> With that in mind, it may be simpler to handle all of this not in the
> translator, but in the function that delivers the ebreak exception.  At that
> point one can arrange to read memory without raising additional
> exceptions.

I'll go explore and see if I can figure any of this out.

I'd still like to get the non-RISC-V SYS_READC patch landed someday :-)

Peter Maydell Nov. 15, 2019, 10:54 a.m. UTC | #25

On Thu, 14 Nov 2019 at 22:27, Keith Packard <keithp@keithp.com> wrote:
>
> Richard Henderson <richard.henderson@linaro.org> writes:
>
> > For semi-hosting, it seems even better if the semi-hosting syscall instruction
> > is not "real", because you're explicitly requesting services from "unreal"
> > hardware.  It should be specified to generate a SIGILL type of exception
> > anywhere semi-hosting is not enabled.
>
> In the QEMU case, yes, it's virtual hardware. However, the other common case
> for semihosting is when doing hardware bringup using JTAG or other
> remote debugging link -- having an instruction which safely traps to the
> debugger is required to make semihosting usable there. Reading through
> the history of the current RISC-V semihosting mechanism, there were many
> designs considered and rejected because they wouldn't work in the JTAG
> environment.
>
> > With that in mind, it may be simpler to handle all of this not in the
> > translator, but in the function that delivers the ebreak exception.  At that
> > point one can arrange to read memory without raising additional
> > exceptions.
>
> I'll go explore and see if I can figure any of this out.
>
> I'd still like to get the non-RISC-V SYS_READC patch landed someday :-)

It's on my queue to review if nobody else gets to it first, but since
we're in freeze right now it won't be landing til after the release
happens (expected mid-December).

thanks
-- PMM

Keith Packard Nov. 15, 2019, 11:40 p.m. UTC | #26

Peter Maydell <peter.maydell@linaro.org> writes:

> It's on my queue to review if nobody else gets to it first, but since
> we're in freeze right now it won't be landing til after the release
> happens (expected mid-December).

Thanks in advance! I'll get started pushing questions about the RISC-V
semihosting ABI into that standard group and see if we can't at least
have the existing situation clarified. I think at a minimum we need:

 1) Explicit reference to the intended ARM API, with an explicit mapping
    between ARM architecture concepts to the ones used for RISC-V,
    especially for how arguments are passed.

 2) Resolution of how to handle sequence which cross page boundaries.

Semihost SYS_READC implementation (v4)

Commit Message

Comments

Patch