[2/5] machine.py: add default pseries params in machine.py

Message ID 20220516165321.872394-3-danielhb413@gmail.com (mailing list archive)
State New, archived
Series machine.py fix for ppc64 tests + avocado changes

Commit Message

Daniel Henrique Barboza May 16, 2022, 4:53 p.m. UTC
pSeries guests set a handful of machine capabilities on by default, all
of them related to security mitigations, that aren't always available in
the host.

This means that, as things stand today, running avocado on a Power9
server without the proper firmware support, with QEMU configured with
--disable-tcg, fails with this error:

 (1/1) tests/avocado/info_usernet.py:InfoUsernet.test_hostfwd: ERROR: ConnectError:
Failed to establish session: EOFError\n  Exit code: 1\n  (...)
(...)
        Command: ./qemu-system-ppc64 -display none -vga none (...)
        Output: qemu-system-ppc64: warning: netdev vnet has no peer
qemu-system-ppc64: Requested safe cache capability level not supported by KVM
Try appending -machine cap-cfpc=broken

info_usernet.py happens to trigger this error first, but all tests would
fail in this configuration because the host does not support the default
'cap-cfpc' capability.

A similar situation was already fixed a couple of years ago by Greg Kurz
(commit 63d57c8f91d0), but that fix was focused on TCG warnings for these
same capabilities when running C qtests. As a side effect, it also spared
qtests running with KVM support from the problem we're now facing with
avocado.

This patch takes a similar approach by amending machine.py to disable
these security capabilities when we're running a pseries guest. The
change is made in the _launch() callback to be sure that we're already
committed to launching the guest. It's also worth noting that we're
relying on self._machine being set accordingly (i.e. via tag:machine),
which is currently the case for all ppc64 related avocado tests.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 python/qemu/machine/machine.py | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

John Snow May 19, 2022, 11:18 p.m. UTC | #1
Hm, okay.

I have plans to try and factor the machine appliance out and into an
upstream package in the near future, so I want to avoid more hardcoding of
defaults.

Does avocado have a subclass of QEMUMachine where it might be more
appropriate to stick this bandaid? Can we make one?

(I don't think iotests runs into this problem because we always use
machine:none there, I think. VM tests might have a similar problem though,
and then it'd be reasonable to want the bandaid here in machine.py ...
well, boo. okay.)

My verdict is that it's a bandaid, but I'll accept it if the avocado folks
agree to it and I'll sort it out later when I do my rewrite.

I don't think I have access to a power9 machine to test this with either,
so I might want a tested-by from someone who does.

--js
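
The subclass idea raised above could look roughly like the sketch below.
It uses a stand-in base class so that it runs without QEMU's Python
package. FakeQEMUMachine and PSeriesTestMachine are hypothetical names;
only _binary, _machine, _args and _launch() are taken from the patch, and
whether the avocado test setup can substitute such a subclass is not
established in this thread.

```python
from typing import List, Optional

# Capability values copied from the patch under review.
PSERIES_CAPS = ("cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,"
                "cap-ccf-assist=off,cap-fwnmi=off")


class FakeQEMUMachine:
    """Stand-in for QEMUMachine (assumption): records args, starts nothing."""

    def __init__(self, binary: str, machine: Optional[str] = None) -> None:
        self._binary = binary
        self._machine = machine
        self._args: List[str] = []
        self.launched_with: Optional[List[str]] = None

    def _launch(self) -> None:
        # The real method spawns QEMU and connects QMP; this one only
        # snapshots the arguments that would have been used.
        self.launched_with = list(self._args)


class PSeriesTestMachine(FakeQEMUMachine):
    """Hypothetical test-side subclass that carries the workaround."""

    def _launch(self) -> None:
        # Same two conditions as the patch, moved out of the base class.
        if "qemu-system-ppc64" in self._binary:
            if self._machine is None or "pseries" in self._machine:
                self._args.extend(["-machine", PSERIES_CAPS])
        super()._launch()
```

With this arrangement the base class keeps no pseries-specific defaults,
and only tests that opt into the subclass get the extra -machine option.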

Matheus K. Ferst May 23, 2022, 7:50 p.m. UTC | #2
On 19/05/2022 20:18, John Snow wrote:
> Hm, okay.
> 
> I have plans to try and factor the machine appliance out and into an 
> upstream package in the near future, so I want to avoid more hardcoding 
> of defaults.
> 
> Does avocado have a subclass of QEMUMachine where it might be more 
> appropriate to stick this bandaid? Can we make one?
> 
> (I don't think iotests runs into this problem because we always use 
> machine:none there, I think. VM tests might have a similar problem 
> though, and then it'd be reasonable to want the bandaid here in 
> machine.py ... well, boo. okay.)
> 
> My verdict is that it's a bandaid, but I'll accept it if the avocado 
> folks agree to it and I'll sort it out later when I do my rewrite.
> 
> I don't think I have access to a power9 machine to test this with 
> either, so I might want a tested-by from someone who does.
> 
> --js
> 

Unfortunately, none of our POWER9 machines has firmware old enough to
be affected by this issue. The closest I can test is a nested KVM-HV
setup with L0 using cap-cfpc=broken, so that L1 receives the error
message quoted in the commit message when running 'make check-avocado'.

With this setup I can confirm that the patch fixes this error, so
Tested-by: Matheus Ferst <matheus.ferst@eldorado.org.br>

Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst
Patch

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 07ac5a710b..12e5e37bff 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -51,6 +51,11 @@ 
 
 
 LOG = logging.getLogger(__name__)
+PSERIES_DEFAULT_CAPABILITIES = ("cap-cfpc=broken,"
+                                "cap-sbbc=broken,"
+                                "cap-ibs=broken,"
+                                "cap-ccf-assist=off,"
+                                "cap-fwnmi=off")
 
 
 class QEMUMachineError(Exception):
@@ -447,6 +452,14 @@  def _launch(self) -> None:
         """
         Launch the VM and establish a QMP connection
         """
+
+        # pseries needs extra machine options to disable Spectre/Meltdown
+        # KVM related capabilities that might not be available in the
+        # host.
+        if "qemu-system-ppc64" in self._binary:
+            if self._machine is None or "pseries" in self._machine:
+                self._args.extend(['-machine', PSERIES_DEFAULT_CAPABILITIES])
+
         self._pre_launch()
         LOG.debug('VM launch command: %r', ' '.join(self._qemu_full_args))
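
The condition added to _launch() can be exercised outside QEMU as a small
pure function. This is a minimal sketch rather than the code in
machine.py: the helper name pseries_extra_args is hypothetical, and only
the capability string and the two checks are taken from the patch above.

```python
from typing import List, Optional

# Capability string copied from the patch; the helper below is illustrative.
PSERIES_DEFAULT_CAPABILITIES = ("cap-cfpc=broken,"
                                "cap-sbbc=broken,"
                                "cap-ibs=broken,"
                                "cap-ccf-assist=off,"
                                "cap-fwnmi=off")


def pseries_extra_args(binary: str, machine: Optional[str]) -> List[str]:
    """Hypothetical helper mirroring the checks the patch adds to _launch()."""
    # Only the ppc64 system emulator is considered (substring match on path).
    if "qemu-system-ppc64" not in binary:
        return []
    # The patch applies the options both when no machine type was set
    # and when the machine type string contains "pseries".
    if machine is None or "pseries" in machine:
        return ["-machine", PSERIES_DEFAULT_CAPABILITIES]
    return []


if __name__ == "__main__":
    for binary, machine in [("./qemu-system-ppc64", "pseries"),
                            ("./qemu-system-ppc64", None),
                            ("./qemu-system-ppc64", "powernv"),
                            ("./qemu-system-x86_64", "pc")]:
        print(binary, machine, pseries_extra_args(binary, machine))
```

Only the first two combinations yield the extra -machine argument; any
other ppc64 machine type, and any other binary, is left untouched.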