x86: host-phys-bits-limit option

Message ID	20181211192527.13254-1-ehabkost@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org> From: Eduardo Habkost <ehabkost@redhat.com> To: qemu-devel@nongnu.org Date: Tue, 11 Dec 2018 17:25:27 -0200 Message-Id: <20181211192527.13254-1-ehabkost@redhat.com> Subject: [Qemu-devel] [PATCH] x86: host-phys-bits-limit option Precedence: list Cc: Andrea Arcangeli <aarcange@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>, paul.c.lai@intel.com, Bandan Das <bsd@redhat.com>, Yu Zhang <yu.c.zhang@linux.intel.com>, Igor Mammedov <imammedo@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <rth@twiddle.net> Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
Series	x86: host-phys-bits-limit option \| expand x86: host-phys-bits-limit option

Eduardo Habkost Dec. 11, 2018, 7:25 p.m. UTC

Some downstream distributions of QEMU set host-phys-bits=on by
default.  This worked very well for most use cases, because
phys-bits really didn't have huge consequences. The only
difference was on the CPUID data seen by guests, and on the
handling of reserved bits.

This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
EPT & Shadow page table support").  Now choosing a large
phys-bits value for a VM has bigger impact: it will make KVM use
5-level EPT even when it's not really necessary.  This means
using the host phys-bits value may not be the best choice.

Management software could address this problem by manually
configuring phys-bits depending on the size of the VM and the
amount of MMIO address space required for hotplug.  But this is
not trivial to implement.

However, there's another workaround that would work for most
cases: keep using the host phys-bits value, but only if it's
smaller than 48.  This patch makes this possible by introducing a
new "-cpu" option: "host-phys-bits-limit".  Management software
or users can make sure they will always use 4-level EPT using:
"host-phys-bits=on,host-phys-bits-limit=48".

This behavior is still not enabled by default because QEMU
doesn't enable host-phys-bits=on by default.  But users,
management software, or downstream distributions may choose to
change their defaults using the new option.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 target/i386/cpu.h                 |  3 ++
 target/i386/cpu.c                 |  5 ++
 tests/acceptance/x86-phys-bits.py | 81 +++++++++++++++++++++++++++++++
 3 files changed, 89 insertions(+)
 create mode 100644 tests/acceptance/x86-phys-bits.py

Yu Zhang Dec. 12, 2018, 9:08 a.m. UTC | #1

On Tue, Dec 11, 2018 at 05:25:27PM -0200, Eduardo Habkost wrote:
> Some downstream distributions of QEMU set host-phys-bits=on by
> default.  This worked very well for most use cases, because
> phys-bits really didn't have huge consequences. The only
> difference was on the CPUID data seen by guests, and on the
> handling of reserved bits.
> 
> This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
> EPT & Shadow page table support").  Now choosing a large
> phys-bits value for a VM has bigger impact: it will make KVM use
> 5-level EPT even when it's not really necessary.  This means
> using the host phys-bits value may not be the best choice.
> 
> Management software could address this problem by manually
> configuring phys-bits depending on the size of the VM and the
> amount of MMIO address space required for hotplug.  But this is
> not trivial to implement.
> 
> However, there's another workaround that would work for most
> cases: keep using the host phys-bits value, but only if it's
> smaller than 48.  This patch makes this possible by introducing a
> new "-cpu" option: "host-phys-bits-limit".  Management software
> or users can make sure they will always use 4-level EPT using:
> "host-phys-bits=on,host-phys-bits-limit=48".
> 
> This behavior is still not enabled by default because QEMU
> doesn't enable host-phys-bits=on by default.  But users,
> management software, or downstream distributions may choose to
> change their defaults using the new option.
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Thanks, Eduardo. One question is, should we check host-phys-bits-limit
against 48? If not, how about we just say in the commit message, that
the suggested value of host-phys-bits-limit is no bigger than 48 to
ensure a 4-level EPT? :-)

B.R.
Yu

> ---
>  target/i386/cpu.h                 |  3 ++
>  target/i386/cpu.c                 |  5 ++
>  tests/acceptance/x86-phys-bits.py | 81 +++++++++++++++++++++++++++++++
>  3 files changed, 89 insertions(+)
>  create mode 100644 tests/acceptance/x86-phys-bits.py
> 
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 9c52d0cbeb..a545b77c2c 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1457,6 +1457,9 @@ struct X86CPU {
>      /* if true override the phys_bits value with a value read from the host */
>      bool host_phys_bits;
>  
> +    /* if set, limit maximum value for phys_bits when host_phys_bits is true */
> +    uint8_t host_phys_bits_limit;
> +
>      /* Stop SMI delivery for migration compatibility with old machines */
>      bool kvm_no_smi_migration;
>  
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f81d35e1f9..df200754a2 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5152,6 +5152,10 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>              if (cpu->host_phys_bits) {
>                  /* The user asked for us to use the host physical bits */
>                  cpu->phys_bits = host_phys_bits;
> +                if (cpu->host_phys_bits_limit &&
> +                    cpu->phys_bits > cpu->host_phys_bits_limit) {
> +                    cpu->phys_bits = cpu->host_phys_bits_limit;
> +                }
>              }
>  
>              /* Print a warning if the user set it to a value that's not the
> @@ -5739,6 +5743,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
>      DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 0),
>      DEFINE_PROP_BOOL("host-phys-bits", X86CPU, host_phys_bits, false),
> +    DEFINE_PROP_UINT8("host-phys-bits-limit", X86CPU, host_phys_bits_limit, 0),
>      DEFINE_PROP_BOOL("fill-mtrr-mask", X86CPU, fill_mtrr_mask, true),
>      DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, UINT32_MAX),
>      DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, UINT32_MAX),
> diff --git a/tests/acceptance/x86-phys-bits.py b/tests/acceptance/x86-phys-bits.py
> new file mode 100644
> index 0000000000..ae85d7e4e7
> --- /dev/null
> +++ b/tests/acceptance/x86-phys-bits.py
> @@ -0,0 +1,81 @@
> +# Test for host-phys-bits-limit option
> +#
> +# Copyright (c) 2018 Red Hat, Inc.
> +#
> +# Author:
> +#  Eduardo Habkost <ehabkost@redhat.com>
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +import re
> +
> +from avocado_qemu import Test
> +
> +class HostPhysbits(Test):
> +    """
> +    Check if `host-phys-bits` and `host-phys-bits` options work.
> +
> +    :avocado: enable
> +    :avocado: tags=x86_64
> +    """
> +
> +    def cpu_qom_get(self, prop):
> +        qom_path = self.vm.command('query-cpus')[0]['qom_path']
> +        return self.vm.command('qom-get', path=qom_path, property=prop)
> +
> +    def cpu_phys_bits(self):
> +        return self.cpu_qom_get('phys-bits')
> +
> +    def host_phys_bits(self):
> +        cpuinfo = open('/proc/cpuinfo', 'rb').read()
> +        m = re.search(b'([0-9]+) bits physical', cpuinfo)
> +        if m is None:
> +            self.cancel("Couldn't read phys-bits from /proc/cpuinfo")
> +        return int(m.group(1))
> +
> +    def setUp(self):
> +        super(HostPhysbits, self).setUp()
> +        self.vm.add_args('-S', '-machine', 'accel=kvm:tcg')
> +        self.vm.launch()
> +        if not self.vm.command('query-kvm')['enabled']:
> +            self.cancel("Test case requires KVM")
> +        self.vm.shutdown()
> +
> +
> +    def test_no_host_phys_bits(self):
> +        # default should be phys-bits=40 if host-phys-bits=off
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=off')
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), 40)
> +
> +    def test_manual_phys_bits(self):
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=off,phys-bits=35')
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), 35)
> +
> +    def test_host_phys_bits(self):
> +        host_phys_bits = self.host_phys_bits()
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on')
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), host_phys_bits)
> +
> +    def test_host_phys_bits_limit_high(self):
> +        hbits = self.host_phys_bits()
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on,'
> +                                 'host-phys-bits-limit=%d' % (hbits + 1))
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), hbits)
> +
> +    def test_host_phys_bits_limit_equal(self):
> +        hbits = self.host_phys_bits()
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on,'
> +                                 'host-phys-bits-limit=%d' % (hbits))
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), hbits)
> +
> +    def test_host_phys_bits_limit_low(self):
> +        hbits = self.host_phys_bits()
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on,'
> +                                 'host-phys-bits-limit=%d' % (hbits - 1))
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), hbits - 1)
> -- 
> 2.18.0.rc1.1.g3f1ff2140
> 
>

Wainer dos Santos Moschetta Dec. 12, 2018, 2:08 p.m. UTC | #2

On 12/11/2018 05:25 PM, Eduardo Habkost wrote:
> Some downstream distributions of QEMU set host-phys-bits=on by
> default.  This worked very well for most use cases, because
> phys-bits really didn't have huge consequences. The only
> difference was on the CPUID data seen by guests, and on the
> handling of reserved bits.
>
> This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
> EPT & Shadow page table support").  Now choosing a large
> phys-bits value for a VM has bigger impact: it will make KVM use
> 5-level EPT even when it's not really necessary.  This means
> using the host phys-bits value may not be the best choice.
>
> Management software could address this problem by manually
> configuring phys-bits depending on the size of the VM and the
> amount of MMIO address space required for hotplug.  But this is
> not trivial to implement.
>
> However, there's another workaround that would work for most
> cases: keep using the host phys-bits value, but only if it's
> smaller than 48.  This patch makes this possible by introducing a
> new "-cpu" option: "host-phys-bits-limit".  Management software
> or users can make sure they will always use 4-level EPT using:
> "host-phys-bits=on,host-phys-bits-limit=48".
>
> This behavior is still not enabled by default because QEMU
> doesn't enable host-phys-bits=on by default.  But users,
> management software, or downstream distributions may choose to
> change their defaults using the new option.
>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>   target/i386/cpu.h                 |  3 ++
>   target/i386/cpu.c                 |  5 ++
>   tests/acceptance/x86-phys-bits.py | 81 +++++++++++++++++++++++++++++++
>   3 files changed, 89 insertions(+)
>   create mode 100644 tests/acceptance/x86-phys-bits.py
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 9c52d0cbeb..a545b77c2c 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1457,6 +1457,9 @@ struct X86CPU {
>       /* if true override the phys_bits value with a value read from the host */
>       bool host_phys_bits;
>   
> +    /* if set, limit maximum value for phys_bits when host_phys_bits is true */
> +    uint8_t host_phys_bits_limit;
> +
>       /* Stop SMI delivery for migration compatibility with old machines */
>       bool kvm_no_smi_migration;
>   
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f81d35e1f9..df200754a2 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5152,6 +5152,10 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>               if (cpu->host_phys_bits) {
>                   /* The user asked for us to use the host physical bits */
>                   cpu->phys_bits = host_phys_bits;
> +                if (cpu->host_phys_bits_limit &&
> +                    cpu->phys_bits > cpu->host_phys_bits_limit) {
> +                    cpu->phys_bits = cpu->host_phys_bits_limit;
> +                }
>               }
>   
>               /* Print a warning if the user set it to a value that's not the
> @@ -5739,6 +5743,7 @@ static Property x86_cpu_properties[] = {
>       DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
>       DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 0),
>       DEFINE_PROP_BOOL("host-phys-bits", X86CPU, host_phys_bits, false),
> +    DEFINE_PROP_UINT8("host-phys-bits-limit", X86CPU, host_phys_bits_limit, 0),
>       DEFINE_PROP_BOOL("fill-mtrr-mask", X86CPU, fill_mtrr_mask, true),
>       DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, UINT32_MAX),
>       DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, UINT32_MAX),

The changes to introduce the host-phys-bits-limit option looks good to 
me, although I haven't much knowledge on this area anyway...

> diff --git a/tests/acceptance/x86-phys-bits.py b/tests/acceptance/x86-phys-bits.py
> new file mode 100644
> index 0000000000..ae85d7e4e7
> --- /dev/null
> +++ b/tests/acceptance/x86-phys-bits.py

PEP8 does not have a convention for Python files naming, although it 
suggests the use of underscore on module and package names. We already 
have the tests/acceptance/boot_linux_console.py with underscore, so I 
think we should make it a convention.


> @@ -0,0 +1,81 @@
> +# Test for host-phys-bits-limit option
> +#
> +# Copyright (c) 2018 Red Hat, Inc.
> +#
> +# Author:
> +#  Eduardo Habkost <ehabkost@redhat.com>
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +import re
> +
> +from avocado_qemu import Test
> +
> +class HostPhysbits(Test):
> +    """
> +    Check if `host-phys-bits` and `host-phys-bits` options work.

Did you mean "`host-phys-bits` and `host-phys-bits-limit`?

> +
> +    :avocado: enable
> +    :avocado: tags=x86_64
> +    """
> +
> +    def cpu_qom_get(self, prop):
> +        qom_path = self.vm.command('query-cpus')[0]['qom_path']
> +        return self.vm.command('qom-get', path=qom_path, property=prop)
> +
> +    def cpu_phys_bits(self):
> +        return self.cpu_qom_get('phys-bits')
> +
> +    def host_phys_bits(self):
> +        cpuinfo = open('/proc/cpuinfo', 'rb').read()
> +        m = re.search(b'([0-9]+) bits physical', cpuinfo)

pylint says:

"tests/acceptance/x86-phys-bits.py:31:8: C0103: Variable name "m" 
doesn't conform to snake_case naming style (invalid-name)"

It also warns about "Missing method docstring" for each test method. I 
don't know whether we should convention that every method must have a 
docstring explaining the test, or just ignore that warn.

> +        if m is None:
> +            self.cancel("Couldn't read phys-bits from /proc/cpuinfo")

I suggest something like 'bits physical size' instead of 'phys-bits'.

> +        return int(m.group(1))
> +
> +    def setUp(self):
> +        super(HostPhysbits, self).setUp()
> +        self.vm.add_args('-S', '-machine', 'accel=kvm:tcg')

I'm curious to know why you freeze the CPU at start (-S).

> +        self.vm.launch()
> +        if not self.vm.command('query-kvm')['enabled']:
> +            self.cancel("Test case requires KVM")
> +        self.vm.shutdown()
> +
> +
> +    def test_no_host_phys_bits(self):
> +        # default should be phys-bits=40 if host-phys-bits=off

Perhaps put that comment in a docstring block.

> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=off')
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), 40)
> +
> +    def test_manual_phys_bits(self):
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=off,phys-bits=35')
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), 35)
> +
> +    def test_host_phys_bits(self):
> +        host_phys_bits = self.host_phys_bits()
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on')
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), host_phys_bits)
> +
> +    def test_host_phys_bits_limit_high(self):
> +        hbits = self.host_phys_bits()
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on,'
> +                                 'host-phys-bits-limit=%d' % (hbits + 1))
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), hbits)
> +
> +    def test_host_phys_bits_limit_equal(self):
> +        hbits = self.host_phys_bits()
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on,'
> +                                 'host-phys-bits-limit=%d' % (hbits))
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), hbits)
> +
> +    def test_host_phys_bits_limit_low(self):
> +        hbits = self.host_phys_bits()
> +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on,'
> +                                 'host-phys-bits-limit=%d' % (hbits - 1))
> +        self.vm.launch()
> +        self.assertEqual(self.cpu_phys_bits(), hbits - 1)

Suppose an user sets both host-phys-bits and phys-bits by mistake. What 
is the expected outcome? If this case is relevant...below test case 
fails (phys-bits is set to host's):

     def test_manual_phys_bits_mistake(self):
         """
         By mistake set host-phys-bits=on and phys-bits
         """
         self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on,phys-bits=35')
         self.vm.launch()
         self.assertEqual(self.cpu_phys_bits(), 35)

- Wainer

Eduardo Habkost Dec. 12, 2018, 2:12 p.m. UTC | #3

On Wed, Dec 12, 2018 at 05:08:39PM +0800, Yu Zhang wrote:
> On Tue, Dec 11, 2018 at 05:25:27PM -0200, Eduardo Habkost wrote:
> > Some downstream distributions of QEMU set host-phys-bits=on by
> > default.  This worked very well for most use cases, because
> > phys-bits really didn't have huge consequences. The only
> > difference was on the CPUID data seen by guests, and on the
> > handling of reserved bits.
> > 
> > This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
> > EPT & Shadow page table support").  Now choosing a large
> > phys-bits value for a VM has bigger impact: it will make KVM use
> > 5-level EPT even when it's not really necessary.  This means
> > using the host phys-bits value may not be the best choice.
> > 
> > Management software could address this problem by manually
> > configuring phys-bits depending on the size of the VM and the
> > amount of MMIO address space required for hotplug.  But this is
> > not trivial to implement.
> > 
> > However, there's another workaround that would work for most
> > cases: keep using the host phys-bits value, but only if it's
> > smaller than 48.  This patch makes this possible by introducing a
> > new "-cpu" option: "host-phys-bits-limit".  Management software
> > or users can make sure they will always use 4-level EPT using:
> > "host-phys-bits=on,host-phys-bits-limit=48".
> > 
> > This behavior is still not enabled by default because QEMU
> > doesn't enable host-phys-bits=on by default.  But users,
> > management software, or downstream distributions may choose to
> > change their defaults using the new option.
> > 
> > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> 
> Thanks, Eduardo. One question is, should we check host-phys-bits-limit
> against 48? If not, how about we just say in the commit message, that
> the suggested value of host-phys-bits-limit is no bigger than 48 to
> ensure a 4-level EPT? :-)

I'm not sure I understood the question.  I tried to document this
at:

|                                      [...]  Management software
| or users can make sure they will always use 4-level EPT using:
| "host-phys-bits=on,host-phys-bits-limit=48".

Eduardo Habkost Dec. 12, 2018, 2:29 p.m. UTC | #4

On Wed, Dec 12, 2018 at 12:08:08PM -0200, Wainer dos Santos Moschetta wrote:
> 
> On 12/11/2018 05:25 PM, Eduardo Habkost wrote:
> > Some downstream distributions of QEMU set host-phys-bits=on by
> > default.  This worked very well for most use cases, because
> > phys-bits really didn't have huge consequences. The only
> > difference was on the CPUID data seen by guests, and on the
> > handling of reserved bits.
> > 
> > This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
> > EPT & Shadow page table support").  Now choosing a large
> > phys-bits value for a VM has bigger impact: it will make KVM use
> > 5-level EPT even when it's not really necessary.  This means
> > using the host phys-bits value may not be the best choice.
> > 
> > Management software could address this problem by manually
> > configuring phys-bits depending on the size of the VM and the
> > amount of MMIO address space required for hotplug.  But this is
> > not trivial to implement.
> > 
> > However, there's another workaround that would work for most
> > cases: keep using the host phys-bits value, but only if it's
> > smaller than 48.  This patch makes this possible by introducing a
> > new "-cpu" option: "host-phys-bits-limit".  Management software
> > or users can make sure they will always use 4-level EPT using:
> > "host-phys-bits=on,host-phys-bits-limit=48".
> > 
> > This behavior is still not enabled by default because QEMU
> > doesn't enable host-phys-bits=on by default.  But users,
> > management software, or downstream distributions may choose to
> > change their defaults using the new option.
> > 
> > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > ---
> >   target/i386/cpu.h                 |  3 ++
> >   target/i386/cpu.c                 |  5 ++
> >   tests/acceptance/x86-phys-bits.py | 81 +++++++++++++++++++++++++++++++
> >   3 files changed, 89 insertions(+)
> >   create mode 100644 tests/acceptance/x86-phys-bits.py
> > 
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index 9c52d0cbeb..a545b77c2c 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1457,6 +1457,9 @@ struct X86CPU {
> >       /* if true override the phys_bits value with a value read from the host */
> >       bool host_phys_bits;
> > +    /* if set, limit maximum value for phys_bits when host_phys_bits is true */
> > +    uint8_t host_phys_bits_limit;
> > +
> >       /* Stop SMI delivery for migration compatibility with old machines */
> >       bool kvm_no_smi_migration;
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index f81d35e1f9..df200754a2 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -5152,6 +5152,10 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> >               if (cpu->host_phys_bits) {
> >                   /* The user asked for us to use the host physical bits */
> >                   cpu->phys_bits = host_phys_bits;
> > +                if (cpu->host_phys_bits_limit &&
> > +                    cpu->phys_bits > cpu->host_phys_bits_limit) {
> > +                    cpu->phys_bits = cpu->host_phys_bits_limit;
> > +                }
> >               }
> >               /* Print a warning if the user set it to a value that's not the
> > @@ -5739,6 +5743,7 @@ static Property x86_cpu_properties[] = {
> >       DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> >       DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 0),
> >       DEFINE_PROP_BOOL("host-phys-bits", X86CPU, host_phys_bits, false),
> > +    DEFINE_PROP_UINT8("host-phys-bits-limit", X86CPU, host_phys_bits_limit, 0),
> >       DEFINE_PROP_BOOL("fill-mtrr-mask", X86CPU, fill_mtrr_mask, true),
> >       DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, UINT32_MAX),
> >       DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, UINT32_MAX),
> 
> The changes to introduce the host-phys-bits-limit option looks good to me,
> although I haven't much knowledge on this area anyway...
> 
> > diff --git a/tests/acceptance/x86-phys-bits.py b/tests/acceptance/x86-phys-bits.py
> > new file mode 100644
> > index 0000000000..ae85d7e4e7
> > --- /dev/null
> > +++ b/tests/acceptance/x86-phys-bits.py
> 
> PEP8 does not have a convention for Python files naming, although it
> suggests the use of underscore on module and package names. We already have
> the tests/acceptance/boot_linux_console.py with underscore, so I think we
> should make it a convention.

Agreed: it's better to be consistent.

> 
> 
> > @@ -0,0 +1,81 @@
> > +# Test for host-phys-bits-limit option
> > +#
> > +# Copyright (c) 2018 Red Hat, Inc.
> > +#
> > +# Author:
> > +#  Eduardo Habkost <ehabkost@redhat.com>
> > +#
> > +# This work is licensed under the terms of the GNU GPL, version 2 or
> > +# later.  See the COPYING file in the top-level directory.
> > +import re
> > +
> > +from avocado_qemu import Test
> > +
> > +class HostPhysbits(Test):
> > +    """
> > +    Check if `host-phys-bits` and `host-phys-bits` options work.
> 
> Did you mean "`host-phys-bits` and `host-phys-bits-limit`?

Yes, thanks for catching it!

> 
> > +
> > +    :avocado: enable
> > +    :avocado: tags=x86_64
> > +    """
> > +
> > +    def cpu_qom_get(self, prop):
> > +        qom_path = self.vm.command('query-cpus')[0]['qom_path']
> > +        return self.vm.command('qom-get', path=qom_path, property=prop)
> > +
> > +    def cpu_phys_bits(self):
> > +        return self.cpu_qom_get('phys-bits')
> > +
> > +    def host_phys_bits(self):
> > +        cpuinfo = open('/proc/cpuinfo', 'rb').read()
> > +        m = re.search(b'([0-9]+) bits physical', cpuinfo)
> 
> pylint says:
> 
> "tests/acceptance/x86-phys-bits.py:31:8: C0103: Variable name "m" doesn't
> conform to snake_case naming style (invalid-name)"

This is one of those cases where I think a more descriptive name
would make the code less readable.

> 
> It also warns about "Missing method docstring" for each test method. I don't
> know whether we should convention that every method must have a docstring
> explaining the test, or just ignore that warn.

I don't think we should require docstrings for all test methods.
I'm already bothered by the current amount of boilerplate
required by the unittest-like Avocado API.

> 
> > +        if m is None:
> > +            self.cancel("Couldn't read phys-bits from /proc/cpuinfo")
> 
> I suggest something like 'bits physical size' instead of 'phys-bits'.

"physical address size" would be more accurate, I will change it.


> 
> > +        return int(m.group(1))
> > +
> > +    def setUp(self):
> > +        super(HostPhysbits, self).setUp()
> > +        self.vm.add_args('-S', '-machine', 'accel=kvm:tcg')
> 
> I'm curious to know why you freeze the CPU at start (-S).

Starting the VCPUs would be pointless, as there's no guest code
to run at all.  It would just waste CPU cycles stuck in BIOS boot
code.

> 
> > +        self.vm.launch()
> > +        if not self.vm.command('query-kvm')['enabled']:
> > +            self.cancel("Test case requires KVM")
> > +        self.vm.shutdown()
> > +
> > +
> > +    def test_no_host_phys_bits(self):
> > +        # default should be phys-bits=40 if host-phys-bits=off
> 
> Perhaps put that comment in a docstring block.

I considered that, but then I thought that the comment doesn't
describe the purpose of the test method at all.  It just
clarifies why we are testing for phys-bits=40 below.


> 
> > +        self.vm.add_args('-cpu', 'qemu64,host-phys-bits=off')
> > +        self.vm.launch()
> > +        self.assertEqual(self.cpu_phys_bits(), 40)
[...]
> 
> Suppose an user sets both host-phys-bits and phys-bits by mistake. What is
> the expected outcome? If this case is relevant...below test case fails
> (phys-bits is set to host's):
> 
>     def test_manual_phys_bits_mistake(self):
>         """
>         By mistake set host-phys-bits=on and phys-bits
>         """
>         self.vm.add_args('-cpu', 'qemu64,host-phys-bits=on,phys-bits=35')
>         self.vm.launch()
>         self.assertEqual(self.cpu_phys_bits(), 35)

host-phys-bits overrides phys-bits, but I don't think we need to
make it a supported use case.

Yu Zhang Dec. 13, 2018, 12:52 a.m. UTC | #5

On Wed, Dec 12, 2018 at 12:12:33PM -0200, Eduardo Habkost wrote:
> On Wed, Dec 12, 2018 at 05:08:39PM +0800, Yu Zhang wrote:
> > On Tue, Dec 11, 2018 at 05:25:27PM -0200, Eduardo Habkost wrote:
> > > Some downstream distributions of QEMU set host-phys-bits=on by
> > > default.  This worked very well for most use cases, because
> > > phys-bits really didn't have huge consequences. The only
> > > difference was on the CPUID data seen by guests, and on the
> > > handling of reserved bits.
> > > 
> > > This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
> > > EPT & Shadow page table support").  Now choosing a large
> > > phys-bits value for a VM has bigger impact: it will make KVM use
> > > 5-level EPT even when it's not really necessary.  This means
> > > using the host phys-bits value may not be the best choice.
> > > 
> > > Management software could address this problem by manually
> > > configuring phys-bits depending on the size of the VM and the
> > > amount of MMIO address space required for hotplug.  But this is
> > > not trivial to implement.
> > > 
> > > However, there's another workaround that would work for most
> > > cases: keep using the host phys-bits value, but only if it's
> > > smaller than 48.  This patch makes this possible by introducing a
> > > new "-cpu" option: "host-phys-bits-limit".  Management software
> > > or users can make sure they will always use 4-level EPT using:
> > > "host-phys-bits=on,host-phys-bits-limit=48".
> > > 
> > > This behavior is still not enabled by default because QEMU
> > > doesn't enable host-phys-bits=on by default.  But users,
> > > management software, or downstream distributions may choose to
> > > change their defaults using the new option.
> > > 
> > > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > 
> > Thanks, Eduardo. One question is, should we check host-phys-bits-limit
> > against 48? If not, how about we just say in the commit message, that
> > the suggested value of host-phys-bits-limit is no bigger than 48 to
> > ensure a 4-level EPT? :-)
> 
> I'm not sure I understood the question.  I tried to document this
> at:
> 
> |                                      [...]  Management software
> | or users can make sure they will always use 4-level EPT using:
> | "host-phys-bits=on,host-phys-bits-limit=48".
> 

Oh, I was just saying that host-phys-bits-limit can be of any
value less than 48, instead of just 48, to ensure a 4-level EPT.
Not a big deal. I guess readers can get the point in your commit
message.  

Another question is, should we check the value of host-phys-bits?
Shall we accept values greater than 48(may be a useless configuration,
but still acceptible), or less than 32? 

Thanks
Yu

> -- 
> Eduardo
>

Eduardo Habkost Jan. 7, 2019, 5:23 p.m. UTC | #6

On Thu, Dec 13, 2018 at 08:52:31AM +0800, Yu Zhang wrote:
> On Wed, Dec 12, 2018 at 12:12:33PM -0200, Eduardo Habkost wrote:
> > On Wed, Dec 12, 2018 at 05:08:39PM +0800, Yu Zhang wrote:
> > > On Tue, Dec 11, 2018 at 05:25:27PM -0200, Eduardo Habkost wrote:
> > > > Some downstream distributions of QEMU set host-phys-bits=on by
> > > > default.  This worked very well for most use cases, because
> > > > phys-bits really didn't have huge consequences. The only
> > > > difference was on the CPUID data seen by guests, and on the
> > > > handling of reserved bits.
> > > > 
> > > > This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
> > > > EPT & Shadow page table support").  Now choosing a large
> > > > phys-bits value for a VM has bigger impact: it will make KVM use
> > > > 5-level EPT even when it's not really necessary.  This means
> > > > using the host phys-bits value may not be the best choice.
> > > > 
> > > > Management software could address this problem by manually
> > > > configuring phys-bits depending on the size of the VM and the
> > > > amount of MMIO address space required for hotplug.  But this is
> > > > not trivial to implement.
> > > > 
> > > > However, there's another workaround that would work for most
> > > > cases: keep using the host phys-bits value, but only if it's
> > > > smaller than 48.  This patch makes this possible by introducing a
> > > > new "-cpu" option: "host-phys-bits-limit".  Management software
> > > > or users can make sure they will always use 4-level EPT using:
> > > > "host-phys-bits=on,host-phys-bits-limit=48".
> > > > 
> > > > This behavior is still not enabled by default because QEMU
> > > > doesn't enable host-phys-bits=on by default.  But users,
> > > > management software, or downstream distributions may choose to
> > > > change their defaults using the new option.
> > > > 
> > > > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > > 
> > > Thanks, Eduardo. One question is, should we check host-phys-bits-limit
> > > against 48? If not, how about we just say in the commit message, that
> > > the suggested value of host-phys-bits-limit is no bigger than 48 to
> > > ensure a 4-level EPT? :-)
> > 
> > I'm not sure I understood the question.  I tried to document this
> > at:
> > 
> > |                                      [...]  Management software
> > | or users can make sure they will always use 4-level EPT using:
> > | "host-phys-bits=on,host-phys-bits-limit=48".
> > 
> 
> Oh, I was just saying that host-phys-bits-limit can be of any
> value less than 48, instead of just 48, to ensure a 4-level EPT.
> Not a big deal. I guess readers can get the point in your commit
> message.  
> 
> Another question is, should we check the value of host-phys-bits?
> Shall we accept values greater than 48(may be a useless configuration,
> but still acceptible), or less than 32? 

I'm not sure.  We can do that, but we can't just choose arbitrary
limits: we need to be able to and point to the specific
requirements/specifications behind them.

Setting host-phys-bits-limit > 48, for example, seems reasonable.
It would be useful to guarantee a VM is migratable to smaller
hosts.

Making sure phys-bits it is big enough to fit all RAM + MMIO
regions would be a good idea, though.  We would need the machine
code to ensure the address limits make sense.

x86: host-phys-bits-limit option

Commit Message

Comments

Patch