diff mbox

target-i386: Sanity check host processor physical address width

Message ID jpgy4iqi85i.fsf@linux.bootlegged.copy (mailing list archive)
State New, archived
Headers show

Commit Message

Bandan Das July 8, 2015, 10:42 p.m. UTC
If a Linux guest is assigned more memory than is supported
by the host processor, the guest is unable to boot. That
is expected, however, there's no message indicating the user
what went wrong. This change prints a message to stderr if
KVM has the corresponding capability.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
 linux-headers/linux/kvm.h | 1 +
 target-i386/kvm.c         | 6 ++++++
 2 files changed, 7 insertions(+)

Comments

Laszlo Ersek July 9, 2015, 7:02 a.m. UTC | #1
On 07/09/15 00:42, Bandan Das wrote:
> 
> If a Linux guest is assigned more memory than is supported
> by the host processor, the guest is unable to boot. That
> is expected, however, there's no message indicating the user
> what went wrong. This change prints a message to stderr if
> KVM has the corresponding capability.
> 
> Reported-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  linux-headers/linux/kvm.h | 1 +
>  target-i386/kvm.c         | 6 ++++++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> index 3bac873..6afad49 100644
> --- a/linux-headers/linux/kvm.h
> +++ b/linux-headers/linux/kvm.h
> @@ -817,6 +817,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_DISABLE_QUIRKS 116
>  #define KVM_CAP_X86_SMM 117
>  #define KVM_CAP_MULTI_ADDRESS_SPACE 118
> +#define KVM_CAP_PHY_ADDR_WIDTH 119
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 066d03d..66e3448 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -892,6 +892,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>      uint64_t shadow_mem;
>      int ret;
>      struct utsname utsname;
> +    int max_phys_bits;
>  
>      ret = kvm_get_supported_msrs(s);
>      if (ret < 0) {
> @@ -945,6 +946,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>          }
>      }
>  
> +    max_phys_bits = kvm_check_extension(s, KVM_CAP_PHY_ADDR_WIDTH);
> +    if (max_phys_bits && (1ULL << max_phys_bits) <= ram_size)
> +        fprintf(stderr, "Warning: The amount of memory assigned to the guest "
> +            "is more than that supported by the host CPU(s). Guest may be unstable.\n");
> +
>      if (kvm_check_extension(s, KVM_CAP_X86_SMM)) {
>          smram_machine_done.notify = register_smram_listener;
>          qemu_add_machine_init_done_notifier(&smram_machine_done);
> 

First, see my comments on the KVM patch.

Second, ram_size is not the right thing to compare. What should be
checked is whether the highest guest-physical address that maps to RAM
can be represented in the address width of the host processor (and only
if EPT is enabled, but that sub-condition belongs to the KVM patch).

Note that this is not the same as the check written in the patch. For
example, if you assume a 32-bit PCI hole with size 1 GB, then a total
guest RAM of size 63 GB will result in the highest guest-phys memory
address being 0xF_FFFF_FFFF, which just fits into 36 bits.

Correspondingly, the above code would not print the warning for

  -m $((63 * 1024 + 1))

on my laptop (which has "address sizes   : 36 bits physical, ..."), even
though such a guest would not boot for me (with EPT enabled).

Please see

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/15418/focus=15447

So, "ram_size" in the controlling expression should be replaced with
"maximum_guest_ram_address" (which should be inclusive, and the <= relop
should be preserved).

Thanks
Laszlo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini July 9, 2015, 7:59 a.m. UTC | #2
On 09/07/2015 00:42, Bandan Das wrote:
> 
> If a Linux guest is assigned more memory than is supported
> by the host processor, the guest is unable to boot. That
> is expected, however, there's no message indicating the user
> what went wrong. This change prints a message to stderr if
> KVM has the corresponding capability.
> 
> Reported-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Bandan Das <bsd@redhat.com>

This is not entirely correct, because it doesn't take the PCI hole into
account.

Perhaps KVM could simply hide memory above the limit (i.e. treat it as
MMIO), and the BIOS could remove RAM above the limit from the e820
memory map?

Paolo

> ---
>  linux-headers/linux/kvm.h | 1 +
>  target-i386/kvm.c         | 6 ++++++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> index 3bac873..6afad49 100644
> --- a/linux-headers/linux/kvm.h
> +++ b/linux-headers/linux/kvm.h
> @@ -817,6 +817,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_DISABLE_QUIRKS 116
>  #define KVM_CAP_X86_SMM 117
>  #define KVM_CAP_MULTI_ADDRESS_SPACE 118
> +#define KVM_CAP_PHY_ADDR_WIDTH 119
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 066d03d..66e3448 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -892,6 +892,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>      uint64_t shadow_mem;
>      int ret;
>      struct utsname utsname;
> +    int max_phys_bits;
>  
>      ret = kvm_get_supported_msrs(s);
>      if (ret < 0) {
> @@ -945,6 +946,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>          }
>      }
>  
> +    max_phys_bits = kvm_check_extension(s, KVM_CAP_PHY_ADDR_WIDTH);
> +    if (max_phys_bits && (1ULL << max_phys_bits) <= ram_size)
> +        fprintf(stderr, "Warning: The amount of memory assigned to the guest "
> +            "is more than that supported by the host CPU(s). Guest may be unstable.\n");
> +
>      if (kvm_check_extension(s, KVM_CAP_X86_SMM)) {
>          smram_machine_done.notify = register_smram_listener;
>          qemu_add_machine_init_done_notifier(&smram_machine_done);
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Laszlo Ersek July 9, 2015, 8:26 a.m. UTC | #3
On 07/09/15 09:59, Paolo Bonzini wrote:
> 
> 
> On 09/07/2015 00:42, Bandan Das wrote:
>>
>> If a Linux guest is assigned more memory than is supported
>> by the host processor, the guest is unable to boot. That
>> is expected, however, there's no message indicating the user
>> what went wrong. This change prints a message to stderr if
>> KVM has the corresponding capability.
>>
>> Reported-by: Laszlo Ersek <lersek@redhat.com>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
> 
> This is not entirely correct, because it doesn't take the PCI hole into
> account.
> 
> Perhaps KVM could simply hide memory above the limit (i.e. treat it as
> MMIO), and the BIOS could remove RAM above the limit from the e820
> memory map?

I'd prefer to leave the guest firmware*s* out of this... :)

E820 is a legacy BIOS concept. In OVMF we'd have to hack the memory
resource descriptor HOBs (which in turn control the DXE memory space
map, which in turn controls the UEFI memory map). Those HOBs are
currently based on what the CMOS reports about the RAM available under
and above 4GB.

It's pretty complex already (will get more complex with SMM support),
and TBH, for working around such an obscure issue, I wouldn't like to
complicate it even further...

After all, this is a host platform limitation. The solution should be to
either move to a more capable host, or do it in software (disable EPT).

Thanks!
Laszlo

> 
> Paolo
> 
>> ---
>>  linux-headers/linux/kvm.h | 1 +
>>  target-i386/kvm.c         | 6 ++++++
>>  2 files changed, 7 insertions(+)
>>
>> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
>> index 3bac873..6afad49 100644
>> --- a/linux-headers/linux/kvm.h
>> +++ b/linux-headers/linux/kvm.h
>> @@ -817,6 +817,7 @@ struct kvm_ppc_smmu_info {
>>  #define KVM_CAP_DISABLE_QUIRKS 116
>>  #define KVM_CAP_X86_SMM 117
>>  #define KVM_CAP_MULTI_ADDRESS_SPACE 118
>> +#define KVM_CAP_PHY_ADDR_WIDTH 119
>>  
>>  #ifdef KVM_CAP_IRQ_ROUTING
>>  
>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>> index 066d03d..66e3448 100644
>> --- a/target-i386/kvm.c
>> +++ b/target-i386/kvm.c
>> @@ -892,6 +892,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>>      uint64_t shadow_mem;
>>      int ret;
>>      struct utsname utsname;
>> +    int max_phys_bits;
>>  
>>      ret = kvm_get_supported_msrs(s);
>>      if (ret < 0) {
>> @@ -945,6 +946,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>>          }
>>      }
>>  
>> +    max_phys_bits = kvm_check_extension(s, KVM_CAP_PHY_ADDR_WIDTH);
>> +    if (max_phys_bits && (1ULL << max_phys_bits) <= ram_size)
>> +        fprintf(stderr, "Warning: The amount of memory assigned to the guest "
>> +            "is more than that supported by the host CPU(s). Guest may be unstable.\n");
>> +
>>      if (kvm_check_extension(s, KVM_CAP_X86_SMM)) {
>>          smram_machine_done.notify = register_smram_listener;
>>          qemu_add_machine_init_done_notifier(&smram_machine_done);
>>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Igor Mammedov July 9, 2015, 9:27 a.m. UTC | #4
On Thu, 09 Jul 2015 09:02:38 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> On 07/09/15 00:42, Bandan Das wrote:
> > 
> > If a Linux guest is assigned more memory than is supported
> > by the host processor, the guest is unable to boot. That
> > is expected, however, there's no message indicating the user
> > what went wrong. This change prints a message to stderr if
> > KVM has the corresponding capability.
> > 
> > Reported-by: Laszlo Ersek <lersek@redhat.com>
> > Signed-off-by: Bandan Das <bsd@redhat.com>
> > ---
> >  linux-headers/linux/kvm.h | 1 +
> >  target-i386/kvm.c         | 6 ++++++
> >  2 files changed, 7 insertions(+)
> > 
> > diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> > index 3bac873..6afad49 100644
> > --- a/linux-headers/linux/kvm.h
> > +++ b/linux-headers/linux/kvm.h
> > @@ -817,6 +817,7 @@ struct kvm_ppc_smmu_info {
> >  #define KVM_CAP_DISABLE_QUIRKS 116
> >  #define KVM_CAP_X86_SMM 117
> >  #define KVM_CAP_MULTI_ADDRESS_SPACE 118
> > +#define KVM_CAP_PHY_ADDR_WIDTH 119
> >  
> >  #ifdef KVM_CAP_IRQ_ROUTING
> >  
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 066d03d..66e3448 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -892,6 +892,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >      uint64_t shadow_mem;
> >      int ret;
> >      struct utsname utsname;
> > +    int max_phys_bits;
> >  
> >      ret = kvm_get_supported_msrs(s);
> >      if (ret < 0) {
> > @@ -945,6 +946,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >          }
> >      }
> >  
> > +    max_phys_bits = kvm_check_extension(s, KVM_CAP_PHY_ADDR_WIDTH);
> > +    if (max_phys_bits && (1ULL << max_phys_bits) <= ram_size)
> > +        fprintf(stderr, "Warning: The amount of memory assigned to the guest "
> > +            "is more than that supported by the host CPU(s). Guest may be unstable.\n");
> > +
> >      if (kvm_check_extension(s, KVM_CAP_X86_SMM)) {
> >          smram_machine_done.notify = register_smram_listener;
> >          qemu_add_machine_init_done_notifier(&smram_machine_done);
> > 
> 
> First, see my comments on the KVM patch.
> 
> Second, ram_size is not the right thing to compare. What should be
> checked is whether the highest guest-physical address that maps to RAM
> can be represented in the address width of the host processor (and only
> if EPT is enabled, but that sub-condition belongs to the KVM patch).
> 
> Note that this is not the same as the check written in the patch. For
> example, if you assume a 32-bit PCI hole with size 1 GB, then a total
> guest RAM of size 63 GB will result in the highest guest-phys memory
> address being 0xF_FFFF_FFFF, which just fits into 36 bits.
> 
> Correspondingly, the above code would not print the warning for
> 
>   -m $((63 * 1024 + 1))
> 
> on my laptop (which has "address sizes   : 36 bits physical, ..."), even
> though such a guest would not boot for me (with EPT enabled).
> 
> Please see
> 
> http://thread.gmane.org/gmane.comp.bios.tianocore.devel/15418/focus=15447
> 
> So, "ram_size" in the controlling expression should be replaced with
> "maximum_guest_ram_address" (which should be inclusive, and the <= relop
> should be preserved).
also with memory hotplug tuned on we should check if the end of
hotplug memory area is less then limit, i.e.:

  pcms->hotplug_memory.base + hotplug_mem_size < 1ULL << max_phys_bits

> 
> Thanks
> Laszlo
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Laszlo Ersek July 9, 2015, 10:03 a.m. UTC | #5
On 07/09/15 11:27, Igor Mammedov wrote:
> On Thu, 09 Jul 2015 09:02:38 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 07/09/15 00:42, Bandan Das wrote:
>>>
>>> If a Linux guest is assigned more memory than is supported
>>> by the host processor, the guest is unable to boot. That
>>> is expected, however, there's no message indicating the user
>>> what went wrong. This change prints a message to stderr if
>>> KVM has the corresponding capability.
>>>
>>> Reported-by: Laszlo Ersek <lersek@redhat.com>
>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>> ---
>>>  linux-headers/linux/kvm.h | 1 +
>>>  target-i386/kvm.c         | 6 ++++++
>>>  2 files changed, 7 insertions(+)
>>>
>>> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
>>> index 3bac873..6afad49 100644
>>> --- a/linux-headers/linux/kvm.h
>>> +++ b/linux-headers/linux/kvm.h
>>> @@ -817,6 +817,7 @@ struct kvm_ppc_smmu_info {
>>>  #define KVM_CAP_DISABLE_QUIRKS 116
>>>  #define KVM_CAP_X86_SMM 117
>>>  #define KVM_CAP_MULTI_ADDRESS_SPACE 118
>>> +#define KVM_CAP_PHY_ADDR_WIDTH 119
>>>  
>>>  #ifdef KVM_CAP_IRQ_ROUTING
>>>  
>>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>>> index 066d03d..66e3448 100644
>>> --- a/target-i386/kvm.c
>>> +++ b/target-i386/kvm.c
>>> @@ -892,6 +892,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>>>      uint64_t shadow_mem;
>>>      int ret;
>>>      struct utsname utsname;
>>> +    int max_phys_bits;
>>>  
>>>      ret = kvm_get_supported_msrs(s);
>>>      if (ret < 0) {
>>> @@ -945,6 +946,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>>>          }
>>>      }
>>>  
>>> +    max_phys_bits = kvm_check_extension(s, KVM_CAP_PHY_ADDR_WIDTH);
>>> +    if (max_phys_bits && (1ULL << max_phys_bits) <= ram_size)
>>> +        fprintf(stderr, "Warning: The amount of memory assigned to the guest "
>>> +            "is more than that supported by the host CPU(s). Guest may be unstable.\n");
>>> +
>>>      if (kvm_check_extension(s, KVM_CAP_X86_SMM)) {
>>>          smram_machine_done.notify = register_smram_listener;
>>>          qemu_add_machine_init_done_notifier(&smram_machine_done);
>>>
>>
>> First, see my comments on the KVM patch.
>>
>> Second, ram_size is not the right thing to compare. What should be
>> checked is whether the highest guest-physical address that maps to RAM
>> can be represented in the address width of the host processor (and only
>> if EPT is enabled, but that sub-condition belongs to the KVM patch).
>>
>> Note that this is not the same as the check written in the patch. For
>> example, if you assume a 32-bit PCI hole with size 1 GB, then a total
>> guest RAM of size 63 GB will result in the highest guest-phys memory
>> address being 0xF_FFFF_FFFF, which just fits into 36 bits.
>>
>> Correspondingly, the above code would not print the warning for
>>
>>   -m $((63 * 1024 + 1))
>>
>> on my laptop (which has "address sizes   : 36 bits physical, ..."), even
>> though such a guest would not boot for me (with EPT enabled).
>>
>> Please see
>>
>> http://thread.gmane.org/gmane.comp.bios.tianocore.devel/15418/focus=15447
>>
>> So, "ram_size" in the controlling expression should be replaced with
>> "maximum_guest_ram_address" (which should be inclusive, and the <= relop
>> should be preserved).
> also with memory hotplug tuned on we should check if the end of
> hotplug memory area is less then limit, i.e.:
> 
>   pcms->hotplug_memory.base + hotplug_mem_size < 1ULL << max_phys_bits

Seems reasonable, thanks for the hint!

(The LHS in this instance is exclusive though, so equality should *not*
trigger the warning. "maximum_guest_ram_address" is inclusive, and
equality should trigger the warning. (Although equality seems quite
impossible in practice.))

Thanks!
Laszlo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Igor Mammedov July 9, 2015, 12:51 p.m. UTC | #6
On Wed, 08 Jul 2015 18:42:01 -0400
Bandan Das <bsd@redhat.com> wrote:

> 
> If a Linux guest is assigned more memory than is supported
> by the host processor, the guest is unable to boot. That
> is expected, however, there's no message indicating the user
> what went wrong. This change prints a message to stderr if
> KVM has the corresponding capability.
> 
> Reported-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  linux-headers/linux/kvm.h | 1 +
>  target-i386/kvm.c         | 6 ++++++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> index 3bac873..6afad49 100644
> --- a/linux-headers/linux/kvm.h
> +++ b/linux-headers/linux/kvm.h
> @@ -817,6 +817,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_DISABLE_QUIRKS 116
>  #define KVM_CAP_X86_SMM 117
>  #define KVM_CAP_MULTI_ADDRESS_SPACE 118
> +#define KVM_CAP_PHY_ADDR_WIDTH 119
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 066d03d..66e3448 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -892,6 +892,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>      uint64_t shadow_mem;
>      int ret;
>      struct utsname utsname;
> +    int max_phys_bits;
>  
>      ret = kvm_get_supported_msrs(s);
>      if (ret < 0) {
> @@ -945,6 +946,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>          }
>      }
>  
> +    max_phys_bits = kvm_check_extension(s, KVM_CAP_PHY_ADDR_WIDTH);
max_phys_bits seems generic enough and could be applied to other targets
as well.

making it a property of machine, would make accessing/manipulating it easier.
define default value for machine/TCG mode and when KVM is enabled
it would override/set its own limit.

then any board could easily access machine->max_gpa to make board specific
checks.

> +    if (max_phys_bits && (1ULL << max_phys_bits) <= ram_size)
> +        fprintf(stderr, "Warning: The amount of memory assigned to the guest "
> +            "is more than that supported by the host CPU(s). Guest may be unstable.\n");
> +
>      if (kvm_check_extension(s, KVM_CAP_X86_SMM)) {
>          smram_machine_done.notify = register_smram_listener;
>          qemu_add_machine_init_done_notifier(&smram_machine_done);

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini July 9, 2015, 1:04 p.m. UTC | #7
On 09/07/2015 10:26, Laszlo Ersek wrote:
>> > 
>> > Perhaps KVM could simply hide memory above the limit (i.e. treat it as
>> > MMIO), and the BIOS could remove RAM above the limit from the e820
>> > memory map?
> I'd prefer to leave the guest firmware*s* out of this... :)
> 
> E820 is a legacy BIOS concept. In OVMF we'd have to hack the memory
> resource descriptor HOBs (which in turn control the DXE memory space
> map, which in turn controls the UEFI memory map). Those HOBs are
> currently based on what the CMOS reports about the RAM available under
> and above 4GB.
> 
> It's pretty complex already (will get more complex with SMM support),
> and TBH, for working around such an obscure issue, I wouldn't like to
> complicate it even further...
> 
> After all, this is a host platform limitation. The solution should be to
> either move to a more capable host, or do it in software (disable EPT).

The reason I mentioned the firmware is because you could in principle
have the same issue on real hardware - say putting 128 GB on your
laptop.  The firmware should cope with it.

If OVMF does not use etc/e820, it can instead hack the values it reads
from CMOS, bounding them according to the CPUID value.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das July 9, 2015, 7:11 p.m. UTC | #8
Laszlo Ersek <lersek@redhat.com> writes:
...
>>>
>>> First, see my comments on the KVM patch.
>>>
>>> Second, ram_size is not the right thing to compare. What should be
>>> checked is whether the highest guest-physical address that maps to RAM
>>> can be represented in the address width of the host processor (and only
>>> if EPT is enabled, but that sub-condition belongs to the KVM patch).
>>>
>>> Note that this is not the same as the check written in the patch. For
>>> example, if you assume a 32-bit PCI hole with size 1 GB, then a total
>>> guest RAM of size 63 GB will result in the highest guest-phys memory
>>> address being 0xF_FFFF_FFFF, which just fits into 36 bits.
>>>
>>> Correspondingly, the above code would not print the warning for
>>>
>>>   -m $((63 * 1024 + 1))
>>>
>>> on my laptop (which has "address sizes   : 36 bits physical, ..."), even
>>> though such a guest would not boot for me (with EPT enabled).
>>>
>>> Please see
>>>
>>> http://thread.gmane.org/gmane.comp.bios.tianocore.devel/15418/focus=15447
>>>
>>> So, "ram_size" in the controlling expression should be replaced with
>>> "maximum_guest_ram_address" (which should be inclusive, and the <= relop
>>> should be preserved).
>> also with memory hotplug tuned on we should check if the end of
>> hotplug memory area is less then limit, i.e.:
>> 
>>   pcms->hotplug_memory.base + hotplug_mem_size < 1ULL << max_phys_bits
>
> Seems reasonable, thanks for the hint!

Thanks Igor and Laszlo, makes sense. I am wondering if this 1GB PCI
hole is always fixed so that I can simply include that in calculating the maximum
guest ram address ? Or do we have to figure that out every time ?

> (The LHS in this instance is exclusive though, so equality should *not*
> trigger the warning. "maximum_guest_ram_address" is inclusive, and
> equality should trigger the warning. (Although equality seems quite
> impossible in practice.))
>
> Thanks!
> Laszlo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das July 9, 2015, 7:22 p.m. UTC | #9
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 09/07/2015 10:26, Laszlo Ersek wrote:
>>> > 
>>> > Perhaps KVM could simply hide memory above the limit (i.e. treat it as
>>> > MMIO), and the BIOS could remove RAM above the limit from the e820
>>> > memory map?
>> I'd prefer to leave the guest firmware*s* out of this... :)
>> 
>> E820 is a legacy BIOS concept. In OVMF we'd have to hack the memory
>> resource descriptor HOBs (which in turn control the DXE memory space
>> map, which in turn controls the UEFI memory map). Those HOBs are
>> currently based on what the CMOS reports about the RAM available under
>> and above 4GB.
>> 
>> It's pretty complex already (will get more complex with SMM support),
>> and TBH, for working around such an obscure issue, I wouldn't like to
>> complicate it even further...
>> 
>> After all, this is a host platform limitation. The solution should be to
>> either move to a more capable host, or do it in software (disable EPT).
>
> The reason I mentioned the firmware is because you could in principle
> have the same issue on real hardware - say putting 128 GB on your
> laptop.  The firmware should cope with it.

Agreed, it's probably not a good idea to deviate too much from how real
hardware would behave IMO. As a simplification of Paolo's idea, is it
possible for qemu to completely ignore memory above the limit ? Will
that break anything ? :)

> If OVMF does not use etc/e820, it can instead hack the values it reads
> from CMOS, bounding them according to the CPUID value.
>
> Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das July 9, 2015, 7:25 p.m. UTC | #10
Igor Mammedov <imammedo@redhat.com> writes:

> On Wed, 08 Jul 2015 18:42:01 -0400
> Bandan Das <bsd@redhat.com> wrote:
>
>> 
>> If a Linux guest is assigned more memory than is supported
>> by the host processor, the guest is unable to boot. That
>> is expected, however, there's no message indicating the user
>> what went wrong. This change prints a message to stderr if
>> KVM has the corresponding capability.
>> 
>> Reported-by: Laszlo Ersek <lersek@redhat.com>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>>  linux-headers/linux/kvm.h | 1 +
>>  target-i386/kvm.c         | 6 ++++++
>>  2 files changed, 7 insertions(+)
>> 
>> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
>> index 3bac873..6afad49 100644
>> --- a/linux-headers/linux/kvm.h
>> +++ b/linux-headers/linux/kvm.h
>> @@ -817,6 +817,7 @@ struct kvm_ppc_smmu_info {
>>  #define KVM_CAP_DISABLE_QUIRKS 116
>>  #define KVM_CAP_X86_SMM 117
>>  #define KVM_CAP_MULTI_ADDRESS_SPACE 118
>> +#define KVM_CAP_PHY_ADDR_WIDTH 119
>>  
>>  #ifdef KVM_CAP_IRQ_ROUTING
>>  
>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>> index 066d03d..66e3448 100644
>> --- a/target-i386/kvm.c
>> +++ b/target-i386/kvm.c
>> @@ -892,6 +892,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>>      uint64_t shadow_mem;
>>      int ret;
>>      struct utsname utsname;
>> +    int max_phys_bits;
>>  
>>      ret = kvm_get_supported_msrs(s);
>>      if (ret < 0) {
>> @@ -945,6 +946,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>>          }
>>      }
>>  
>> +    max_phys_bits = kvm_check_extension(s, KVM_CAP_PHY_ADDR_WIDTH);
> max_phys_bits seems generic enough and could be applied to other targets
> as well.

I am a little clueless about other targets but is figuring this out from userspace
as simple as it's on x86 (cpuid)? If not, then I agree, this could be made a generic
value.

Bandan

> making it a property of machine, would make accessing/manipulating it easier.
> define default value for machine/TCG mode and when KVM is enabled
> it would override/set its own limit.
>
> then any board could easily access machine->max_gpa to make board specific
> checks.
>
>> +    if (max_phys_bits && (1ULL << max_phys_bits) <= ram_size)
>> +        fprintf(stderr, "Warning: The amount of memory assigned to the guest "
>> +            "is more than that supported by the host CPU(s). Guest may be unstable.\n");
>> +
>>      if (kvm_check_extension(s, KVM_CAP_X86_SMM)) {
>>          smram_machine_done.notify = register_smram_listener;
>>          qemu_add_machine_init_done_notifier(&smram_machine_done);
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Laszlo Ersek July 9, 2015, 7:30 p.m. UTC | #11
On 07/09/15 21:11, Bandan Das wrote:
> Laszlo Ersek <lersek@redhat.com> writes:
> ...
>>>>
>>>> First, see my comments on the KVM patch.
>>>>
>>>> Second, ram_size is not the right thing to compare. What should be
>>>> checked is whether the highest guest-physical address that maps to RAM
>>>> can be represented in the address width of the host processor (and only
>>>> if EPT is enabled, but that sub-condition belongs to the KVM patch).
>>>>
>>>> Note that this is not the same as the check written in the patch. For
>>>> example, if you assume a 32-bit PCI hole with size 1 GB, then a total
>>>> guest RAM of size 63 GB will result in the highest guest-phys memory
>>>> address being 0xF_FFFF_FFFF, which just fits into 36 bits.
>>>>
>>>> Correspondingly, the above code would not print the warning for
>>>>
>>>>   -m $((63 * 1024 + 1))
>>>>
>>>> on my laptop (which has "address sizes   : 36 bits physical, ..."), even
>>>> though such a guest would not boot for me (with EPT enabled).
>>>>
>>>> Please see
>>>>
>>>> http://thread.gmane.org/gmane.comp.bios.tianocore.devel/15418/focus=15447
>>>>
>>>> So, "ram_size" in the controlling expression should be replaced with
>>>> "maximum_guest_ram_address" (which should be inclusive, and the <= relop
>>>> should be preserved).
>>> also with memory hotplug tuned on we should check if the end of
>>> hotplug memory area is less then limit, i.e.:
>>>
>>>   pcms->hotplug_memory.base + hotplug_mem_size < 1ULL << max_phys_bits
>>
>> Seems reasonable, thanks for the hint!
> 
> Thanks Igor and Laszlo, makes sense. I am wondering if this 1GB PCI
> hole is always fixed so that I can simply include that in calculating the maximum
> guest ram address ? Or do we have to figure that out every time ?

Please grep the tree for "above_4g_mem_size". The size of the 32-bit PCI
hole is not constant, but all the necessary computation goes into
"above_4g_mem_size" already.

So I think you should derive the max possible gpa from
"above_4g_mem_size" and the top of the hotpluggable memory area, and
compare that against the PCPU address width, *if* EPT is enabled.

(BTW "pcms->hotplug_memory.base" depends on "above_4g_mem_size" too.)

Thanks
Laszlo

> 
>> (The LHS in this instance is exclusive though, so equality should *not*
>> trigger the warning. "maximum_guest_ram_address" is inclusive, and
>> equality should trigger the warning. (Although equality seems quite
>> impossible in practice.))
>>
>> Thanks!
>> Laszlo

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 3bac873..6afad49 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -817,6 +817,7 @@  struct kvm_ppc_smmu_info {
 #define KVM_CAP_DISABLE_QUIRKS 116
 #define KVM_CAP_X86_SMM 117
 #define KVM_CAP_MULTI_ADDRESS_SPACE 118
+#define KVM_CAP_PHY_ADDR_WIDTH 119
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 066d03d..66e3448 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -892,6 +892,7 @@  int kvm_arch_init(MachineState *ms, KVMState *s)
     uint64_t shadow_mem;
     int ret;
     struct utsname utsname;
+    int max_phys_bits;
 
     ret = kvm_get_supported_msrs(s);
     if (ret < 0) {
@@ -945,6 +946,11 @@  int kvm_arch_init(MachineState *ms, KVMState *s)
         }
     }
 
+    max_phys_bits = kvm_check_extension(s, KVM_CAP_PHY_ADDR_WIDTH);
+    if (max_phys_bits && (1ULL << max_phys_bits) <= ram_size)
+        fprintf(stderr, "Warning: The amount of memory assigned to the guest "
+            "is more than that supported by the host CPU(s). Guest may be unstable.\n");
+
     if (kvm_check_extension(s, KVM_CAP_X86_SMM)) {
         smram_machine_done.notify = register_smram_listener;
         qemu_add_machine_init_done_notifier(&smram_machine_done);