diff mbox series

[v2,1/1] modules: Improve error message when module is not found

Message ID 20210722220952.17444-2-jziviani@suse.de (mailing list archive)
State New, archived
Headers show
Series Improve module accelerator error message | expand

Commit Message

Jose R. Ziviani July 22, 2021, 10:09 p.m. UTC
When a module is not found, specially accelerators, QEMU displays
a error message that not easy to understand[1]. This patch improves
the readability by offering a user-friendly message[2].

This patch also moves the accelerator ops check to runtine (instead
of the original g_assert) because it works better with dynamic
modules.

[1] qemu-system-x86_64 -accel tcg
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
(ops != NULL)
Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
assertion failed: (ops != NULL)
    31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...

[2] qemu-system-x86_64 -accel tcg
accel-tcg-x86_64 module is missing, install the package or config the library path correctly.

Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
---
 accel/accel-softmmu.c |  5 ++++-
 util/module.c         | 14 ++++++++------
 2 files changed, 12 insertions(+), 7 deletions(-)

Comments

Gerd Hoffmann July 23, 2021, 6:25 a.m. UTC | #1
Hi,

> --- a/accel/accel-softmmu.c
> +++ b/accel/accel-softmmu.c
> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
>       * all accelerators need to define ops, providing at least a mandatory
>       * non-NULL create_vcpu_thread operation.
>       */
> -    g_assert(ops != NULL);
> +    if (ops == NULL) {

Error message here?
Also split accel bits into a separate patch?

>          g_hash_table_remove(loaded_modules, module_name);
> +        fprintf(stderr, "%s module is missing, install the "
> +                        "package or config the library path "
> +                        "correctly.\n", module_name);

This should be error_report(), or maybe warn_report() as this
isn't a fatal error in all cases.  Otherwise looks good to me.

take care,
  Gerd
Claudio Fontana July 23, 2021, 8:04 a.m. UTC | #2
On 7/23/21 8:25 AM, Gerd Hoffmann wrote:
>   Hi,
> 
>> --- a/accel/accel-softmmu.c
>> +++ b/accel/accel-softmmu.c
>> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
>>       * all accelerators need to define ops, providing at least a mandatory
>>       * non-NULL create_vcpu_thread operation.
>>       */
>> -    g_assert(ops != NULL);
>> +    if (ops == NULL) {
> 
> Error message here?> Also split accel bits into a separate patch?

If ops is NULL there is something seriously wrong with the code assumptions.
Asserts are in my view the proper way to handle these.

I do not think there is any reason to change this.

> 
>>          g_hash_table_remove(loaded_modules, module_name);
>> +        fprintf(stderr, "%s module is missing, install the "
>> +                        "package or config the library path "
>> +                        "correctly.\n", module_name);
> 
> This should be error_report(), or maybe warn_report() as this
> isn't a fatal error in all cases.  Otherwise looks good to me.
> 
> take care,
>   Gerd
>
Markus Armbruster July 23, 2021, 8:28 a.m. UTC | #3
"Jose R. Ziviani" <jziviani@suse.de> writes:

> When a module is not found, specially accelerators, QEMU displays
> a error message that not easy to understand[1]. This patch improves
> the readability by offering a user-friendly message[2].
>
> This patch also moves the accelerator ops check to runtine (instead
> of the original g_assert) because it works better with dynamic
> modules.
>
> [1] qemu-system-x86_64 -accel tcg
> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
> (ops != NULL)
> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
> assertion failed: (ops != NULL)
>     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...

This isn't an error message, it's a crash :)

> [2] qemu-system-x86_64 -accel tcg
> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.

s/config/configure/

Also drop the period.

>
> Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
> ---
>  accel/accel-softmmu.c |  5 ++++-
>  util/module.c         | 14 ++++++++------
>  2 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
> index 67276e4f52..52449ac2d0 100644
> --- a/accel/accel-softmmu.c
> +++ b/accel/accel-softmmu.c
> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
>       * all accelerators need to define ops, providing at least a mandatory
>       * non-NULL create_vcpu_thread operation.
>       */
> -    g_assert(ops != NULL);
> +    if (ops == NULL) {
> +        exit(1);
> +    }
> +

Not your patch's fault: I'm not sure the comment makes sense.

>      if (ops->ops_init) {
>          ops->ops_init(ops);
>      }
> diff --git a/util/module.c b/util/module.c
> index 6bb4ad915a..268a8563fd 100644
> --- a/util/module.c
> +++ b/util/module.c
> @@ -206,13 +206,10 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
>  out:
>      return ret;
>  }
> -#endif

Why do you need to mess with the ifdeffery?

>  
>  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>  {
>      bool success = false;
> -
> -#ifdef CONFIG_MODULES
>      char *fname = NULL;
>  #ifdef CONFIG_MODULE_UPGRADES
>      char *version_dir;
> @@ -300,6 +297,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>  
>      if (!success) {
>          g_hash_table_remove(loaded_modules, module_name);
> +        fprintf(stderr, "%s module is missing, install the "
> +                        "package or config the library path "
> +                        "correctly.\n", module_name);
>          g_free(module_name);
>      }
>  

Again, not your patch's fault: reporting to stderr with fprintf() is
almost always wrong.

When the thing we report is an error, we should use error_report() for
correct formatting.  Likewise, warn_report() for warnings, info_report()
for informational messages.

When the module load is triggered by a monitor command, we probably want
to report problems to the monitor.  error_report() & friends do the
right thing for HMP.  For QMP, you have to use the Error API, i.e. have
the function take an Error ** argument, which the callers propagate all
the way to the QMP core.

To fix this issue, we first need to decide what kind of message this is:
error, warning, something else.

> @@ -307,12 +307,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>          g_free(dirs[i]);
>      }
>  
> -#endif
>      return success;
>  }
>  
> -#ifdef CONFIG_MODULES
> -
>  static bool module_loaded_qom_all;
>  
>  void module_load_qom_one(const char *type)
> @@ -384,4 +381,9 @@ void qemu_load_module_for_opts(const char *group) {}
>  void module_load_qom_one(const char *type) {}
>  void module_load_qom_all(void) {}
>  
> +bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> +{
> +    return false;
> +}
> +
>  #endif
Claudio Fontana July 23, 2021, 8:32 a.m. UTC | #4
Hello Markus

On 7/23/21 10:28 AM, Markus Armbruster wrote:
> "Jose R. Ziviani" <jziviani@suse.de> writes:
> 
>> When a module is not found, specially accelerators, QEMU displays
>> a error message that not easy to understand[1]. This patch improves
>> the readability by offering a user-friendly message[2].
>>
>> This patch also moves the accelerator ops check to runtine (instead
>> of the original g_assert) because it works better with dynamic
>> modules.
>>
>> [1] qemu-system-x86_64 -accel tcg
>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
>> (ops != NULL)
>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
>> assertion failed: (ops != NULL)
>>     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...
> 
> This isn't an error message, it's a crash :)


Yes, and the reason of the crash is that the code that needs to initialize ops is missing.

The assert needs to _stay there_,
this is a symptom, not the cause of the problem,

we should not be blindly removing asserts.


> 
>> [2] qemu-system-x86_64 -accel tcg
>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
> 
> s/config/configure/
> 
> Also drop the period.
> 
>>
>> Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
>> ---
>>  accel/accel-softmmu.c |  5 ++++-
>>  util/module.c         | 14 ++++++++------
>>  2 files changed, 12 insertions(+), 7 deletions(-)
>>
>> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
>> index 67276e4f52..52449ac2d0 100644
>> --- a/accel/accel-softmmu.c
>> +++ b/accel/accel-softmmu.c
>> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
>>       * all accelerators need to define ops, providing at least a mandatory
>>       * non-NULL create_vcpu_thread operation.
>>       */
>> -    g_assert(ops != NULL);
>> +    if (ops == NULL) {
>> +        exit(1);
>> +    }
>> +
> 
> Not your patch's fault: I'm not sure the comment makes sense.


Yes, the comment makes sense.


> 
>>      if (ops->ops_init) {
>>          ops->ops_init(ops);
>>      }
>> diff --git a/util/module.c b/util/module.c
>> index 6bb4ad915a..268a8563fd 100644
>> --- a/util/module.c
>> +++ b/util/module.c
>> @@ -206,13 +206,10 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
>>  out:
>>      return ret;
>>  }
>> -#endif
> 
> Why do you need to mess with the ifdeffery?
> 
>>  
>>  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>  {
>>      bool success = false;
>> -
>> -#ifdef CONFIG_MODULES
>>      char *fname = NULL;
>>  #ifdef CONFIG_MODULE_UPGRADES
>>      char *version_dir;
>> @@ -300,6 +297,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>  
>>      if (!success) {
>>          g_hash_table_remove(loaded_modules, module_name);
>> +        fprintf(stderr, "%s module is missing, install the "
>> +                        "package or config the library path "
>> +                        "correctly.\n", module_name);
>>          g_free(module_name);
>>      }
>>  
> 
> Again, not your patch's fault: reporting to stderr with fprintf() is
> almost always wrong.
> 
> When the thing we report is an error, we should use error_report() for
> correct formatting.  Likewise, warn_report() for warnings, info_report()
> for informational messages.
> 
> When the module load is triggered by a monitor command, we probably want
> to report problems to the monitor.  error_report() & friends do the
> right thing for HMP.  For QMP, you have to use the Error API, i.e. have
> the function take an Error ** argument, which the callers propagate all
> the way to the QMP core.
> 
> To fix this issue, we first need to decide what kind of message this is:
> error, warning, something else.
> 
>> @@ -307,12 +307,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>          g_free(dirs[i]);
>>      }
>>  
>> -#endif
>>      return success;
>>  }
>>  
>> -#ifdef CONFIG_MODULES
>> -
>>  static bool module_loaded_qom_all;
>>  
>>  void module_load_qom_one(const char *type)
>> @@ -384,4 +381,9 @@ void qemu_load_module_for_opts(const char *group) {}
>>  void module_load_qom_one(const char *type) {}
>>  void module_load_qom_all(void) {}
>>  
>> +bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>> +{
>> +    return false;
>> +}
>> +
>>  #endif
>
Claudio Fontana July 23, 2021, 9:41 a.m. UTC | #5
On 7/23/21 12:09 AM, Jose R. Ziviani wrote:
> When a module is not found, specially accelerators, QEMU displays
> a error message that not easy to understand[1]. This patch improves
> the readability by offering a user-friendly message[2].
> 
> This patch also moves the accelerator ops check to runtine (instead
> of the original g_assert) because it works better with dynamic
> modules.
> 
> [1] qemu-system-x86_64 -accel tcg
> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
> (ops != NULL)
> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
> assertion failed: (ops != NULL)
>     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...
> 
> [2] qemu-system-x86_64 -accel tcg
> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
> 
> Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
> ---
>  accel/accel-softmmu.c |  5 ++++-
>  util/module.c         | 14 ++++++++------
>  2 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
> index 67276e4f52..52449ac2d0 100644
> --- a/accel/accel-softmmu.c
> +++ b/accel/accel-softmmu.c
> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
>       * all accelerators need to define ops, providing at least a mandatory
>       * non-NULL create_vcpu_thread operation.
>       */
> -    g_assert(ops != NULL);
> +    if (ops == NULL) {
> +        exit(1);
> +    }
> +


Ah, again, why?
This change looks wrong to me, 

the ops code should be present when ops interfaces are initialized:
it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,

why would we want to do anything else but to assert here?

Am I blind to something obvious?

>      if (ops->ops_init) {
>          ops->ops_init(ops);
>      }
> diff --git a/util/module.c b/util/module.c
> index 6bb4ad915a..268a8563fd 100644
> --- a/util/module.c
> +++ b/util/module.c
> @@ -206,13 +206,10 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
>  out:
>      return ret;
>  }
> -#endif
>  
>  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>  {
>      bool success = false;
> -
> -#ifdef CONFIG_MODULES
>      char *fname = NULL;
>  #ifdef CONFIG_MODULE_UPGRADES
>      char *version_dir;
> @@ -300,6 +297,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>  
>      if (!success) {
>          g_hash_table_remove(loaded_modules, module_name);
> +        fprintf(stderr, "%s module is missing, install the "
> +                        "package or config the library path "
> +                        "correctly.\n", module_name);
>          g_free(module_name);
>      }
>  
> @@ -307,12 +307,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>          g_free(dirs[i]);
>      }
>  
> -#endif
>      return success;
>  }
>  
> -#ifdef CONFIG_MODULES
> -
>  static bool module_loaded_qom_all;
>  
>  void module_load_qom_one(const char *type)
> @@ -384,4 +381,9 @@ void qemu_load_module_for_opts(const char *group) {}
>  void module_load_qom_one(const char *type) {}
>  void module_load_qom_all(void) {}
>  
> +bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> +{
> +    return false;
> +}
> +
>  #endif
>
Gerd Hoffmann July 23, 2021, 9:52 a.m. UTC | #6
> > -    g_assert(ops != NULL);
> > +    if (ops == NULL) {
> > +        exit(1);
> > +    }
> > +
> 
> 
> Ah, again, why?
> This change looks wrong to me, 
> 
> the ops code should be present when ops interfaces are initialized:
> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
> 
> why would we want to do anything else but to assert here?
> 
> Am I blind to something obvious?

Building tcg accel ops modular moves that from coding error to possible
user error (user wants use tcg but has qemu-accel-tcg-$arch.rpm not
installed).

The second part of the patch makes qemu print a message on the failed
module load, so the user would have a chance to figure where the assert
comes from, but replacing the assert with a more friendly message still
makes sense to me.

take care,
  Gerd
Claudio Fontana July 23, 2021, 11:20 a.m. UTC | #7
On 7/23/21 11:52 AM, Gerd Hoffmann wrote:
>>> -    g_assert(ops != NULL);
>>> +    if (ops == NULL) {
>>> +        exit(1);
>>> +    }
>>> +
>>
>>
>> Ah, again, why?
>> This change looks wrong to me, 
>>
>> the ops code should be present when ops interfaces are initialized:
>> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
>>
>> why would we want to do anything else but to assert here?
>>
>> Am I blind to something obvious?
> 
> Building tcg accel ops modular moves that from coding error to possible
> user error (user wants use tcg but has qemu-accel-tcg-$arch.rpm not
> installed).

Sorry but without more background I don't buy it.

If ops is null at the time accel_init_interfaces is called,
it means that we are trying to initialize the board (for softmmu)
with an accelerator already selected, and without an accelerator actually available.

The problem has happened already a long time before we get here.

When we check for viable accelerators, in configure_accelerators,
we should check that the code is actually there, before choosing it as a viable accelerator.

If we march on and start initializing the machine with an accelerator that is not available,
of course things will start failing left and right.

If things like:

bool have_tcg = accel_find("tcg");

return true when the code is actually not there, there seems to be a larger issue to solve.

I think we need to think more broadly about this.

Thanks,

Claudio


> 
> The second part of the patch makes qemu print a message on the failed
> module load, so the user would have a chance to figure where the assert
> comes from, but replacing the assert with a more friendly message still
> makes sense to me.
> 
> take care,
>   Gerd
>
Claudio Fontana July 23, 2021, 12:20 p.m. UTC | #8
On 7/23/21 1:20 PM, Claudio Fontana wrote:
> On 7/23/21 11:52 AM, Gerd Hoffmann wrote:
>>>> -    g_assert(ops != NULL);
>>>> +    if (ops == NULL) {
>>>> +        exit(1);
>>>> +    }
>>>> +
>>>
>>>
>>> Ah, again, why?
>>> This change looks wrong to me, 
>>>
>>> the ops code should be present when ops interfaces are initialized:
>>> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
>>>
>>> why would we want to do anything else but to assert here?
>>>
>>> Am I blind to something obvious?
>>
>> Building tcg accel ops modular moves that from coding error to possible
>> user error (user wants use tcg but has qemu-accel-tcg-$arch.rpm not
>> installed).
> 
> Sorry but without more background I don't buy it.
> 
> If ops is null at the time accel_init_interfaces is called,
> it means that we are trying to initialize the board (for softmmu)
> with an accelerator already selected, and without an accelerator actually available.
> 
> The problem has happened already a long time before we get here.
> 
> When we check for viable accelerators, in configure_accelerators,
> we should check that the code is actually there, before choosing it as a viable accelerator.
> 
> If we march on and start initializing the machine with an accelerator that is not available,
> of course things will start failing left and right.
> 
> If things like:
> 
> bool have_tcg = accel_find("tcg");
> 
> return true when the code is actually not there, there seems to be a larger issue to solve.
> 
> I think we need to think more broadly about this.

Overall, building the whole code base to be modular,
and then _not_ including unwanted modules in the base distro package,

is the whole point of going through this at all.

QEMU should gracefully figure out that indeed, the module is not there -> TCG is not there.

So we need more work to make this actually work right.

> Thanks,
> 
> Claudio
> 
> 
>>
>> The second part of the patch makes qemu print a message on the failed
>> module load, so the user would have a chance to figure where the assert
>> comes from, but replacing the assert with a more friendly message still
>> makes sense to me.
>>
>> take care,
>>   Gerd
>>
> 
>
Gerd Hoffmann July 23, 2021, 12:48 p.m. UTC | #9
> > bool have_tcg = accel_find("tcg");
> > 
> > return true when the code is actually not there, there seems to be a larger issue to solve.
> > 
> > I think we need to think more broadly about this.
> 
> Overall, building the whole code base to be modular,
> and then _not_ including unwanted modules in the base distro package,
> is the whole point of going through this at all.

Yes.

Right now we are only half-through.  tcg-accel-ops is modular, but
tcg-accel is not (yet).  Which I guess is why the assert() can trigger
now.

So add a patch with error message and a FIXME comment, which we can
revert when isn't needed any more?

> So we need more work to make this actually work right.

Yes.  I want have all of tcg in the tcg accel module, not only parts of
it, but that needs some more refactoring.  I'll go start looking at this
once I managed to wade through my vacation backlog.

take care,
  Gerd
Jose R. Ziviani July 23, 2021, 1:50 p.m. UTC | #10
On Fri, Jul 23, 2021 at 11:41:19AM +0200, Claudio Fontana wrote:
> On 7/23/21 12:09 AM, Jose R. Ziviani wrote:
> > When a module is not found, specially accelerators, QEMU displays
> > a error message that not easy to understand[1]. This patch improves
> > the readability by offering a user-friendly message[2].
> > 
> > This patch also moves the accelerator ops check to runtine (instead
> > of the original g_assert) because it works better with dynamic
> > modules.
> > 
> > [1] qemu-system-x86_64 -accel tcg
> > ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
> > (ops != NULL)
> > Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
> > assertion failed: (ops != NULL)
> >     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...
> > 
> > [2] qemu-system-x86_64 -accel tcg
> > accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
> > 
> > Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
> > ---
> >  accel/accel-softmmu.c |  5 ++++-
> >  util/module.c         | 14 ++++++++------
> >  2 files changed, 12 insertions(+), 7 deletions(-)
> > 
> > diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
> > index 67276e4f52..52449ac2d0 100644
> > --- a/accel/accel-softmmu.c
> > +++ b/accel/accel-softmmu.c
> > @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
> >       * all accelerators need to define ops, providing at least a mandatory
> >       * non-NULL create_vcpu_thread operation.
> >       */
> > -    g_assert(ops != NULL);
> > +    if (ops == NULL) {
> > +        exit(1);
> > +    }
> > +
> 
> 
> Ah, again, why?
> This change looks wrong to me, 
> 
> the ops code should be present when ops interfaces are initialized:
> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
> 
> why would we want to do anything else but to assert here?
> 
> Am I blind to something obvious?

Hello!

Thank you for reviewing it!

The problem is that if your TCG module is not installed and you start
QEMU like:

./qemu-system-x86_64 -accel tcg

You'll get the error message + a crash with a core dump:

accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
**
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
[1]    5740 IOT instruction (core dumped)  ./qemu-system-x86_64 -accel tcg

I was digging a little bit more in order to move this responsibility to
module.c but there isn't enough information there to safely exit() in
all situations that a module may be loaded. As Gerd mentioned, more work
is needed in order to achieve that.

However, it's not nice to have a crash due to an optional module missing.
It's specially confusing because TCG has always been native. Considering
also that we're already in hard freeze for 6.1, I thought to have this
simpler check instead.

What do you think if we have something like:

/* FIXME: this isn't the right place to handle a missing module and
   must be reverted when the module refactoring is completely done */
#ifdef CONFIG_MODULES
if (ops == NULL) {
    exit(1);
}
#else
g_assert(ops != NULL);
#endif

Regards!

> 
> >      if (ops->ops_init) {
> >          ops->ops_init(ops);
> >      }
> > diff --git a/util/module.c b/util/module.c
> > index 6bb4ad915a..268a8563fd 100644
> > --- a/util/module.c
> > +++ b/util/module.c
> > @@ -206,13 +206,10 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
> >  out:
> >      return ret;
> >  }
> > -#endif
> >  
> >  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >  {
> >      bool success = false;
> > -
> > -#ifdef CONFIG_MODULES
> >      char *fname = NULL;
> >  #ifdef CONFIG_MODULE_UPGRADES
> >      char *version_dir;
> > @@ -300,6 +297,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >  
> >      if (!success) {
> >          g_hash_table_remove(loaded_modules, module_name);
> > +        fprintf(stderr, "%s module is missing, install the "
> > +                        "package or config the library path "
> > +                        "correctly.\n", module_name);
> >          g_free(module_name);
> >      }
> >  
> > @@ -307,12 +307,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >          g_free(dirs[i]);
> >      }
> >  
> > -#endif
> >      return success;
> >  }
> >  
> > -#ifdef CONFIG_MODULES
> > -
> >  static bool module_loaded_qom_all;
> >  
> >  void module_load_qom_one(const char *type)
> > @@ -384,4 +381,9 @@ void qemu_load_module_for_opts(const char *group) {}
> >  void module_load_qom_one(const char *type) {}
> >  void module_load_qom_all(void) {}
> >  
> > +bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> > +{
> > +    return false;
> > +}
> > +
> >  #endif
> > 
>
Claudio Fontana July 23, 2021, 2:02 p.m. UTC | #11
On 7/23/21 3:50 PM, Jose R. Ziviani wrote:
> On Fri, Jul 23, 2021 at 11:41:19AM +0200, Claudio Fontana wrote:
>> On 7/23/21 12:09 AM, Jose R. Ziviani wrote:
>>> When a module is not found, specially accelerators, QEMU displays
>>> a error message that not easy to understand[1]. This patch improves
>>> the readability by offering a user-friendly message[2].
>>>
>>> This patch also moves the accelerator ops check to runtine (instead
>>> of the original g_assert) because it works better with dynamic
>>> modules.
>>>
>>> [1] qemu-system-x86_64 -accel tcg
>>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
>>> (ops != NULL)
>>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
>>> assertion failed: (ops != NULL)
>>>     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...
>>>
>>> [2] qemu-system-x86_64 -accel tcg
>>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
>>>
>>> Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
>>> ---
>>>  accel/accel-softmmu.c |  5 ++++-
>>>  util/module.c         | 14 ++++++++------
>>>  2 files changed, 12 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
>>> index 67276e4f52..52449ac2d0 100644
>>> --- a/accel/accel-softmmu.c
>>> +++ b/accel/accel-softmmu.c
>>> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
>>>       * all accelerators need to define ops, providing at least a mandatory
>>>       * non-NULL create_vcpu_thread operation.
>>>       */
>>> -    g_assert(ops != NULL);
>>> +    if (ops == NULL) {
>>> +        exit(1);
>>> +    }
>>> +
>>
>>
>> Ah, again, why?
>> This change looks wrong to me, 
>>
>> the ops code should be present when ops interfaces are initialized:
>> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
>>
>> why would we want to do anything else but to assert here?
>>
>> Am I blind to something obvious?
> 
> Hello!
> 
> Thank you for reviewing it!
> 
> The problem is that if your TCG module is not installed and you start
> QEMU like:
> 
> ./qemu-system-x86_64 -accel tcg
> 
> You'll get the error message + a crash with a core dump:
> 
> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
> **
> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
> [1]    5740 IOT instruction (core dumped)  ./qemu-system-x86_64 -accel tcg
> 
> I was digging a little bit more in order to move this responsibility to
> module.c but there isn't enough information there to safely exit() in
> all situations that a module may be loaded. As Gerd mentioned, more work
> is needed in order to achieve that.
> 
> However, it's not nice to have a crash due to an optional module missing.
> It's specially confusing because TCG has always been native. Considering
> also that we're already in hard freeze for 6.1, I thought to have this
> simpler check instead.
> 
> What do you think if we have something like:
> 
> /* FIXME: this isn't the right place to handle a missing module and
>    must be reverted when the module refactoring is completely done */
> #ifdef CONFIG_MODULES
> if (ops == NULL) {
>     exit(1);
> }
> #else
> g_assert(ops != NULL);
> #endif
> 
> Regards!


For the normal builds (without modular tcg), this issue does not appear right?
So maybe there is no pressure to change anything for 6.1, and we can work on the right solution on master?

Not sure how we consider this feature for 6.1, I guess it is still not a supported option,
(is there any CI for this? Probably not right?),

so I would consider building modular tcg in 6.1 as "experimental", and we can proceed to do the right thing on master?

Thanks,

Claudio

> 
>>
>>>      if (ops->ops_init) {
>>>          ops->ops_init(ops);
>>>      }
>>> diff --git a/util/module.c b/util/module.c
>>> index 6bb4ad915a..268a8563fd 100644
>>> --- a/util/module.c
>>> +++ b/util/module.c
>>> @@ -206,13 +206,10 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
>>>  out:
>>>      return ret;
>>>  }
>>> -#endif
>>>  
>>>  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>>  {
>>>      bool success = false;
>>> -
>>> -#ifdef CONFIG_MODULES
>>>      char *fname = NULL;
>>>  #ifdef CONFIG_MODULE_UPGRADES
>>>      char *version_dir;
>>> @@ -300,6 +297,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>>  
>>>      if (!success) {
>>>          g_hash_table_remove(loaded_modules, module_name);
>>> +        fprintf(stderr, "%s module is missing, install the "
>>> +                        "package or config the library path "
>>> +                        "correctly.\n", module_name);
>>>          g_free(module_name);
>>>      }
>>>  
>>> @@ -307,12 +307,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>>          g_free(dirs[i]);
>>>      }
>>>  
>>> -#endif
>>>      return success;
>>>  }
>>>  
>>> -#ifdef CONFIG_MODULES
>>> -
>>>  static bool module_loaded_qom_all;
>>>  
>>>  void module_load_qom_one(const char *type)
>>> @@ -384,4 +381,9 @@ void qemu_load_module_for_opts(const char *group) {}
>>>  void module_load_qom_one(const char *type) {}
>>>  void module_load_qom_all(void) {}
>>>  
>>> +bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>> +{
>>> +    return false;
>>> +}
>>> +
>>>  #endif
>>>
>>
Philippe Mathieu-Daudé July 23, 2021, 2:14 p.m. UTC | #12
On 7/23/21 4:02 PM, Claudio Fontana wrote:
> On 7/23/21 3:50 PM, Jose R. Ziviani wrote:
>> On Fri, Jul 23, 2021 at 11:41:19AM +0200, Claudio Fontana wrote:
>>> On 7/23/21 12:09 AM, Jose R. Ziviani wrote:
>>>> When a module is not found, specially accelerators, QEMU displays
>>>> a error message that not easy to understand[1]. This patch improves
>>>> the readability by offering a user-friendly message[2].
>>>>
>>>> This patch also moves the accelerator ops check to runtine (instead
>>>> of the original g_assert) because it works better with dynamic
>>>> modules.
>>>>
>>>> [1] qemu-system-x86_64 -accel tcg
>>>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
>>>> (ops != NULL)
>>>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
>>>> assertion failed: (ops != NULL)
>>>>     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...
>>>>
>>>> [2] qemu-system-x86_64 -accel tcg
>>>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
>>>>
>>>> Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
>>>> ---
>>>>  accel/accel-softmmu.c |  5 ++++-
>>>>  util/module.c         | 14 ++++++++------
>>>>  2 files changed, 12 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
>>>> index 67276e4f52..52449ac2d0 100644
>>>> --- a/accel/accel-softmmu.c
>>>> +++ b/accel/accel-softmmu.c
>>>> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
>>>>       * all accelerators need to define ops, providing at least a mandatory
>>>>       * non-NULL create_vcpu_thread operation.
>>>>       */
>>>> -    g_assert(ops != NULL);
>>>> +    if (ops == NULL) {
>>>> +        exit(1);
>>>> +    }
>>>> +
>>>
>>>
>>> Ah, again, why?
>>> This change looks wrong to me, 
>>>
>>> the ops code should be present when ops interfaces are initialized:
>>> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
>>>
>>> why would we want to do anything else but to assert here?
>>>
>>> Am I blind to something obvious?
>>
>> Hello!
>>
>> Thank you for reviewing it!
>>
>> The problem is that if your TCG module is not installed and you start
>> QEMU like:
>>
>> ./qemu-system-x86_64 -accel tcg
>>
>> You'll get the error message + a crash with a core dump:
>>
>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
>> **
>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
>> [1]    5740 IOT instruction (core dumped)  ./qemu-system-x86_64 -accel tcg
>>
>> I was digging a little bit more in order to move this responsibility to
>> module.c but there isn't enough information there to safely exit() in
>> all situations that a module may be loaded. As Gerd mentioned, more work
>> is needed in order to achieve that.
>>
>> However, it's not nice to have a crash due to an optional module missing.
>> It's specially confusing because TCG has always been native. Considering
>> also that we're already in hard freeze for 6.1, I thought to have this
>> simpler check instead.
>>
>> What do you think if we have something like:
>>
>> /* FIXME: this isn't the right place to handle a missing module and
>>    must be reverted when the module refactoring is completely done */
>> #ifdef CONFIG_MODULES
>> if (ops == NULL) {
>>     exit(1);
>> }
>> #else
>> g_assert(ops != NULL);
>> #endif
>>
>> Regards!
> 
> 
> For the normal builds (without modular tcg), this issue does not appear right?
> So maybe there is no pressure to change anything for 6.1, and we can work on the right solution on master?
> 
> Not sure how we consider this feature for 6.1, I guess it is still not a supported option,
> (is there any CI for this? Probably not right?),
> 
> so I would consider building modular tcg in 6.1 as "experimental", and we can proceed to do the right thing on master?

+1 as I don't see this feature ready.
Jose R. Ziviani July 23, 2021, 2:36 p.m. UTC | #13
On Fri, Jul 23, 2021 at 04:02:26PM +0200, Claudio Fontana wrote:
> On 7/23/21 3:50 PM, Jose R. Ziviani wrote:
> > On Fri, Jul 23, 2021 at 11:41:19AM +0200, Claudio Fontana wrote:
> >> On 7/23/21 12:09 AM, Jose R. Ziviani wrote:
> >>> When a module is not found, specially accelerators, QEMU displays
> >>> a error message that not easy to understand[1]. This patch improves
> >>> the readability by offering a user-friendly message[2].
> >>>
> >>> This patch also moves the accelerator ops check to runtine (instead
> >>> of the original g_assert) because it works better with dynamic
> >>> modules.
> >>>
> >>> [1] qemu-system-x86_64 -accel tcg
> >>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
> >>> (ops != NULL)
> >>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
> >>> assertion failed: (ops != NULL)
> >>>     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...
> >>>
> >>> [2] qemu-system-x86_64 -accel tcg
> >>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
> >>>
> >>> Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
> >>> ---
> >>>  accel/accel-softmmu.c |  5 ++++-
> >>>  util/module.c         | 14 ++++++++------
> >>>  2 files changed, 12 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
> >>> index 67276e4f52..52449ac2d0 100644
> >>> --- a/accel/accel-softmmu.c
> >>> +++ b/accel/accel-softmmu.c
> >>> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
> >>>       * all accelerators need to define ops, providing at least a mandatory
> >>>       * non-NULL create_vcpu_thread operation.
> >>>       */
> >>> -    g_assert(ops != NULL);
> >>> +    if (ops == NULL) {
> >>> +        exit(1);
> >>> +    }
> >>> +
> >>
> >>
> >> Ah, again, why?
> >> This change looks wrong to me, 
> >>
> >> the ops code should be present when ops interfaces are initialized:
> >> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
> >>
> >> why would we want to do anything else but to assert here?
> >>
> >> Am I blind to something obvious?
> > 
> > Hello!
> > 
> > Thank you for reviewing it!
> > 
> > The problem is that if your TCG module is not installed and you start
> > QEMU like:
> > 
> > ./qemu-system-x86_64 -accel tcg
> > 
> > You'll get the error message + a crash with a core dump:
> > 
> > accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
> > **
> > ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
> > Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
> > [1]    5740 IOT instruction (core dumped)  ./qemu-system-x86_64 -accel tcg
> > 
> > I was digging a little bit more in order to move this responsibility to
> > module.c but there isn't enough information there to safely exit() in
> > all situations that a module may be loaded. As Gerd mentioned, more work
> > is needed in order to achieve that.
> > 
> > However, it's not nice to have a crash due to an optional module missing.
> > It's specially confusing because TCG has always been native. Considering
> > also that we're already in hard freeze for 6.1, I thought to have this
> > simpler check instead.
> > 
> > What do you think if we have something like:
> > 
> > /* FIXME: this isn't the right place to handle a missing module and
> >    must be reverted when the module refactoring is completely done */
> > #ifdef CONFIG_MODULES
> > if (ops == NULL) {
> >     exit(1);
> > }
> > #else
> > g_assert(ops != NULL);
> > #endif
> > 
> > Regards!
> 
> 
> For the normal builds (without modular tcg), this issue does not appear right?

Yes, but OpenSUSE already builds with --enable-modules, we've already been shipping
several modules as optional RPMs, like qemu-hw-display-virtio-gpu for example. I sent
a patch some weeks ago to add "--enable-tcg-builtin" in the build system but there're
more work required in that area as well.

> So maybe there is no pressure to change anything for 6.1, and we can work on the right solution on master?
> 
> Not sure how we consider this feature for 6.1, I guess it is still not a supported option,
> (is there any CI for this? Probably not right?),
> 
> so I would consider building modular tcg in 6.1 as "experimental", and we can proceed to do the right thing on master?

For OpenSUSE Tumbleweed, when we release QEMU 6.1, I can add my patch to
"--enable-tcg-builtin" for downstream only. I'm fine with it too.

Thank you!!!

> 
> Thanks,
> 
> Claudio
> 
> > 
> >>
> >>>      if (ops->ops_init) {
> >>>          ops->ops_init(ops);
> >>>      }
> >>> diff --git a/util/module.c b/util/module.c
> >>> index 6bb4ad915a..268a8563fd 100644
> >>> --- a/util/module.c
> >>> +++ b/util/module.c
> >>> @@ -206,13 +206,10 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
> >>>  out:
> >>>      return ret;
> >>>  }
> >>> -#endif
> >>>  
> >>>  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >>>  {
> >>>      bool success = false;
> >>> -
> >>> -#ifdef CONFIG_MODULES
> >>>      char *fname = NULL;
> >>>  #ifdef CONFIG_MODULE_UPGRADES
> >>>      char *version_dir;
> >>> @@ -300,6 +297,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >>>  
> >>>      if (!success) {
> >>>          g_hash_table_remove(loaded_modules, module_name);
> >>> +        fprintf(stderr, "%s module is missing, install the "
> >>> +                        "package or config the library path "
> >>> +                        "correctly.\n", module_name);
> >>>          g_free(module_name);
> >>>      }
> >>>  
> >>> @@ -307,12 +307,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >>>          g_free(dirs[i]);
> >>>      }
> >>>  
> >>> -#endif
> >>>      return success;
> >>>  }
> >>>  
> >>> -#ifdef CONFIG_MODULES
> >>> -
> >>>  static bool module_loaded_qom_all;
> >>>  
> >>>  void module_load_qom_one(const char *type)
> >>> @@ -384,4 +381,9 @@ void qemu_load_module_for_opts(const char *group) {}
> >>>  void module_load_qom_one(const char *type) {}
> >>>  void module_load_qom_all(void) {}
> >>>  
> >>> +bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >>> +{
> >>> +    return false;
> >>> +}
> >>> +
> >>>  #endif
> >>>
> >>
>
Claudio Fontana July 23, 2021, 3:27 p.m. UTC | #14
On 7/23/21 4:36 PM, Jose R. Ziviani wrote:
> On Fri, Jul 23, 2021 at 04:02:26PM +0200, Claudio Fontana wrote:
>> On 7/23/21 3:50 PM, Jose R. Ziviani wrote:
>>> On Fri, Jul 23, 2021 at 11:41:19AM +0200, Claudio Fontana wrote:
>>>> On 7/23/21 12:09 AM, Jose R. Ziviani wrote:
>>>>> When a module is not found, specially accelerators, QEMU displays
>>>>> a error message that not easy to understand[1]. This patch improves
>>>>> the readability by offering a user-friendly message[2].
>>>>>
>>>>> This patch also moves the accelerator ops check to runtine (instead
>>>>> of the original g_assert) because it works better with dynamic
>>>>> modules.
>>>>>
>>>>> [1] qemu-system-x86_64 -accel tcg
>>>>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
>>>>> (ops != NULL)
>>>>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
>>>>> assertion failed: (ops != NULL)
>>>>>     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...
>>>>>
>>>>> [2] qemu-system-x86_64 -accel tcg
>>>>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
>>>>>
>>>>> Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
>>>>> ---
>>>>>  accel/accel-softmmu.c |  5 ++++-
>>>>>  util/module.c         | 14 ++++++++------
>>>>>  2 files changed, 12 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
>>>>> index 67276e4f52..52449ac2d0 100644
>>>>> --- a/accel/accel-softmmu.c
>>>>> +++ b/accel/accel-softmmu.c
>>>>> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
>>>>>       * all accelerators need to define ops, providing at least a mandatory
>>>>>       * non-NULL create_vcpu_thread operation.
>>>>>       */
>>>>> -    g_assert(ops != NULL);
>>>>> +    if (ops == NULL) {
>>>>> +        exit(1);
>>>>> +    }
>>>>> +
>>>>
>>>>
>>>> Ah, again, why?
>>>> This change looks wrong to me, 
>>>>
>>>> the ops code should be present when ops interfaces are initialized:
>>>> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
>>>>
>>>> why would we want to do anything else but to assert here?
>>>>
>>>> Am I blind to something obvious?
>>>
>>> Hello!
>>>
>>> Thank you for reviewing it!
>>>
>>> The problem is that if your TCG module is not installed and you start
>>> QEMU like:
>>>
>>> ./qemu-system-x86_64 -accel tcg
>>>
>>> You'll get the error message + a crash with a core dump:
>>>
>>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
>>> **
>>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
>>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
>>> [1]    5740 IOT instruction (core dumped)  ./qemu-system-x86_64 -accel tcg
>>>
>>> I was digging a little bit more in order to move this responsibility to
>>> module.c but there isn't enough information there to safely exit() in
>>> all situations that a module may be loaded. As Gerd mentioned, more work
>>> is needed in order to achieve that.
>>>
>>> However, it's not nice to have a crash due to an optional module missing.
>>> It's specially confusing because TCG has always been native. Considering
>>> also that we're already in hard freeze for 6.1, I thought to have this
>>> simpler check instead.
>>>
>>> What do you think if we have something like:
>>>
>>> /* FIXME: this isn't the right place to handle a missing module and
>>>    must be reverted when the module refactoring is completely done */
>>> #ifdef CONFIG_MODULES
>>> if (ops == NULL) {
>>>     exit(1);
>>> }
>>> #else
>>> g_assert(ops != NULL);
>>> #endif
>>>
>>> Regards!
>>
>>
>> For the normal builds (without modular tcg), this issue does not appear right?
> 
> Yes, but OpenSUSE already builds with --enable-modules, we've already been shipping
> several modules as optional RPMs, like qemu-hw-display-virtio-gpu for example. I sent
> a patch some weeks ago to add "--enable-tcg-builtin" in the build system but there're
> more work required in that area as well.
> 
>> So maybe there is no pressure to change anything for 6.1, and we can work on the right solution on master?
>>
>> Not sure how we consider this feature for 6.1, I guess it is still not a supported option,
>> (is there any CI for this? Probably not right?),
>>
>> so I would consider building modular tcg in 6.1 as "experimental", and we can proceed to do the right thing on master?
> 
> For OpenSUSE Tumbleweed, when we release QEMU 6.1, I can add my patch to
> "--enable-tcg-builtin" for downstream only. I'm fine with it too.

Hi Jose,

indeed if we need to do something downstream it's fine,
but lets keep the discussion on this list focused on what is best for upstream.

Ciao, thanks :-)

Claudio

> 
> Thank you!!!
> 
>>
>> Thanks,
>>
>> Claudio
>>
>>>
>>>>
>>>>>      if (ops->ops_init) {
>>>>>          ops->ops_init(ops);
>>>>>      }
>>>>> diff --git a/util/module.c b/util/module.c
>>>>> index 6bb4ad915a..268a8563fd 100644
>>>>> --- a/util/module.c
>>>>> +++ b/util/module.c
>>>>> @@ -206,13 +206,10 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
>>>>>  out:
>>>>>      return ret;
>>>>>  }
>>>>> -#endif
>>>>>  
>>>>>  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>>>>  {
>>>>>      bool success = false;
>>>>> -
>>>>> -#ifdef CONFIG_MODULES
>>>>>      char *fname = NULL;
>>>>>  #ifdef CONFIG_MODULE_UPGRADES
>>>>>      char *version_dir;
>>>>> @@ -300,6 +297,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>>>>  
>>>>>      if (!success) {
>>>>>          g_hash_table_remove(loaded_modules, module_name);
>>>>> +        fprintf(stderr, "%s module is missing, install the "
>>>>> +                        "package or config the library path "
>>>>> +                        "correctly.\n", module_name);
>>>>>          g_free(module_name);
>>>>>      }
>>>>>  
>>>>> @@ -307,12 +307,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>>>>          g_free(dirs[i]);
>>>>>      }
>>>>>  
>>>>> -#endif
>>>>>      return success;
>>>>>  }
>>>>>  
>>>>> -#ifdef CONFIG_MODULES
>>>>> -
>>>>>  static bool module_loaded_qom_all;
>>>>>  
>>>>>  void module_load_qom_one(const char *type)
>>>>> @@ -384,4 +381,9 @@ void qemu_load_module_for_opts(const char *group) {}
>>>>>  void module_load_qom_one(const char *type) {}
>>>>>  void module_load_qom_all(void) {}
>>>>>  
>>>>> +bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
>>>>> +{
>>>>> +    return false;
>>>>> +}
>>>>> +
>>>>>  #endif
>>>>>
>>>>
>>
Jose R. Ziviani July 23, 2021, 3:46 p.m. UTC | #15
On Fri, Jul 23, 2021 at 05:27:25PM +0200, Claudio Fontana wrote:
> On 7/23/21 4:36 PM, Jose R. Ziviani wrote:
> > On Fri, Jul 23, 2021 at 04:02:26PM +0200, Claudio Fontana wrote:
> >> On 7/23/21 3:50 PM, Jose R. Ziviani wrote:
> >>> On Fri, Jul 23, 2021 at 11:41:19AM +0200, Claudio Fontana wrote:
> >>>> On 7/23/21 12:09 AM, Jose R. Ziviani wrote:
> >>>>> When a module is not found, specially accelerators, QEMU displays
> >>>>> a error message that not easy to understand[1]. This patch improves
> >>>>> the readability by offering a user-friendly message[2].
> >>>>>
> >>>>> This patch also moves the accelerator ops check to runtine (instead
> >>>>> of the original g_assert) because it works better with dynamic
> >>>>> modules.
> >>>>>
> >>>>> [1] qemu-system-x86_64 -accel tcg
> >>>>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed:
> >>>>> (ops != NULL)
> >>>>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces:
> >>>>> assertion failed: (ops != NULL)
> >>>>>     31964 IOT instruction (core dumped)  ./qemu-system-x86_64 ...
> >>>>>
> >>>>> [2] qemu-system-x86_64 -accel tcg
> >>>>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
> >>>>>
> >>>>> Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
> >>>>> ---
> >>>>>  accel/accel-softmmu.c |  5 ++++-
> >>>>>  util/module.c         | 14 ++++++++------
> >>>>>  2 files changed, 12 insertions(+), 7 deletions(-)
> >>>>>
> >>>>> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
> >>>>> index 67276e4f52..52449ac2d0 100644
> >>>>> --- a/accel/accel-softmmu.c
> >>>>> +++ b/accel/accel-softmmu.c
> >>>>> @@ -79,7 +79,10 @@ void accel_init_ops_interfaces(AccelClass *ac)
> >>>>>       * all accelerators need to define ops, providing at least a mandatory
> >>>>>       * non-NULL create_vcpu_thread operation.
> >>>>>       */
> >>>>> -    g_assert(ops != NULL);
> >>>>> +    if (ops == NULL) {
> >>>>> +        exit(1);
> >>>>> +    }
> >>>>> +
> >>>>
> >>>>
> >>>> Ah, again, why?
> >>>> This change looks wrong to me, 
> >>>>
> >>>> the ops code should be present when ops interfaces are initialized:
> >>>> it should be a code level assertion, as it has to do with the proper order of initializations in QEMU,
> >>>>
> >>>> why would we want to do anything else but to assert here?
> >>>>
> >>>> Am I blind to something obvious?
> >>>
> >>> Hello!
> >>>
> >>> Thank you for reviewing it!
> >>>
> >>> The problem is that if your TCG module is not installed and you start
> >>> QEMU like:
> >>>
> >>> ./qemu-system-x86_64 -accel tcg
> >>>
> >>> You'll get the error message + a crash with a core dump:
> >>>
> >>> accel-tcg-x86_64 module is missing, install the package or config the library path correctly.
> >>> **
> >>> ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
> >>> Bail out! ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
> >>> [1]    5740 IOT instruction (core dumped)  ./qemu-system-x86_64 -accel tcg
> >>>
> >>> I was digging a little bit more in order to move this responsibility to
> >>> module.c but there isn't enough information there to safely exit() in
> >>> all situations that a module may be loaded. As Gerd mentioned, more work
> >>> is needed in order to achieve that.
> >>>
> >>> However, it's not nice to have a crash due to an optional module missing.
> >>> It's specially confusing because TCG has always been native. Considering
> >>> also that we're already in hard freeze for 6.1, I thought to have this
> >>> simpler check instead.
> >>>
> >>> What do you think if we have something like:
> >>>
> >>> /* FIXME: this isn't the right place to handle a missing module and
> >>>    must be reverted when the module refactoring is completely done */
> >>> #ifdef CONFIG_MODULES
> >>> if (ops == NULL) {
> >>>     exit(1);
> >>> }
> >>> #else
> >>> g_assert(ops != NULL);
> >>> #endif
> >>>
> >>> Regards!
> >>
> >>
> >> For the normal builds (without modular tcg), this issue does not appear right?
> > 
> > Yes, but OpenSUSE already builds with --enable-modules, we've already been shipping
> > several modules as optional RPMs, like qemu-hw-display-virtio-gpu for example. I sent
> > a patch some weeks ago to add "--enable-tcg-builtin" in the build system but there're
> > more work required in that area as well.
> > 
> >> So maybe there is no pressure to change anything for 6.1, and we can work on the right solution on master?
> >>
> >> Not sure how we consider this feature for 6.1, I guess it is still not a supported option,
> >> (is there any CI for this? Probably not right?),
> >>
> >> so I would consider building modular tcg in 6.1 as "experimental", and we can proceed to do the right thing on master?
> > 
> > For OpenSUSE Tumbleweed, when we release QEMU 6.1, I can add my patch to
> > "--enable-tcg-builtin" for downstream only. I'm fine with it too.
> 
> Hi Jose,
> 
> indeed if we need to do something downstream it's fine,
> but lets keep the discussion on this list focused on what is best for upstream.

Absolutely! I'm sorry. I just wanted to answer that --enable-modules is
already used. Didn't mean cause any disturbance, sorry.

Thank you!!

> 
> Ciao, thanks :-)
> 
> Claudio
> 
> > 
> > Thank you!!!
> > 
> >>
> >> Thanks,
> >>
> >> Claudio
> >>
> >>>
> >>>>
> >>>>>      if (ops->ops_init) {
> >>>>>          ops->ops_init(ops);
> >>>>>      }
> >>>>> diff --git a/util/module.c b/util/module.c
> >>>>> index 6bb4ad915a..268a8563fd 100644
> >>>>> --- a/util/module.c
> >>>>> +++ b/util/module.c
> >>>>> @@ -206,13 +206,10 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
> >>>>>  out:
> >>>>>      return ret;
> >>>>>  }
> >>>>> -#endif
> >>>>>  
> >>>>>  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >>>>>  {
> >>>>>      bool success = false;
> >>>>> -
> >>>>> -#ifdef CONFIG_MODULES
> >>>>>      char *fname = NULL;
> >>>>>  #ifdef CONFIG_MODULE_UPGRADES
> >>>>>      char *version_dir;
> >>>>> @@ -300,6 +297,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >>>>>  
> >>>>>      if (!success) {
> >>>>>          g_hash_table_remove(loaded_modules, module_name);
> >>>>> +        fprintf(stderr, "%s module is missing, install the "
> >>>>> +                        "package or config the library path "
> >>>>> +                        "correctly.\n", module_name);
> >>>>>          g_free(module_name);
> >>>>>      }
> >>>>>  
> >>>>> @@ -307,12 +307,9 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >>>>>          g_free(dirs[i]);
> >>>>>      }
> >>>>>  
> >>>>> -#endif
> >>>>>      return success;
> >>>>>  }
> >>>>>  
> >>>>> -#ifdef CONFIG_MODULES
> >>>>> -
> >>>>>  static bool module_loaded_qom_all;
> >>>>>  
> >>>>>  void module_load_qom_one(const char *type)
> >>>>> @@ -384,4 +381,9 @@ void qemu_load_module_for_opts(const char *group) {}
> >>>>>  void module_load_qom_one(const char *type) {}
> >>>>>  void module_load_qom_all(void) {}
> >>>>>  
> >>>>> +bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
> >>>>> +{
> >>>>> +    return false;
> >>>>> +}
> >>>>> +
> >>>>>  #endif
> >>>>>
> >>>>
> >>
>
diff mbox series

Patch

diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
index 67276e4f52..52449ac2d0 100644
--- a/accel/accel-softmmu.c
+++ b/accel/accel-softmmu.c
@@ -79,7 +79,10 @@  void accel_init_ops_interfaces(AccelClass *ac)
      * all accelerators need to define ops, providing at least a mandatory
      * non-NULL create_vcpu_thread operation.
      */
-    g_assert(ops != NULL);
+    if (ops == NULL) {
+        exit(1);
+    }
+
     if (ops->ops_init) {
         ops->ops_init(ops);
     }
diff --git a/util/module.c b/util/module.c
index 6bb4ad915a..268a8563fd 100644
--- a/util/module.c
+++ b/util/module.c
@@ -206,13 +206,10 @@  static int module_load_file(const char *fname, bool mayfail, bool export_symbols
 out:
     return ret;
 }
-#endif
 
 bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
 {
     bool success = false;
-
-#ifdef CONFIG_MODULES
     char *fname = NULL;
 #ifdef CONFIG_MODULE_UPGRADES
     char *version_dir;
@@ -300,6 +297,9 @@  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
 
     if (!success) {
         g_hash_table_remove(loaded_modules, module_name);
+        fprintf(stderr, "%s module is missing, install the "
+                        "package or config the library path "
+                        "correctly.\n", module_name);
         g_free(module_name);
     }
 
@@ -307,12 +307,9 @@  bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
         g_free(dirs[i]);
     }
 
-#endif
     return success;
 }
 
-#ifdef CONFIG_MODULES
-
 static bool module_loaded_qom_all;
 
 void module_load_qom_one(const char *type)
@@ -384,4 +381,9 @@  void qemu_load_module_for_opts(const char *group) {}
 void module_load_qom_one(const char *type) {}
 void module_load_qom_all(void) {}
 
+bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
+{
+    return false;
+}
+
 #endif