diff mbox series

[v1,10/14] s390x/pv: check initial, not maximum RAM size

Message ID 20240910175809.2135596-11-david@redhat.com (mailing list archive)
State New, archived
Headers show
Series s390x: virtio-mem support | expand

Commit Message

David Hildenbrand Sept. 10, 2024, 5:58 p.m. UTC
We actually want to check the available RAM, not the maximum RAM size.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/kvm/pv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Nina Schoetterl-Glausch Sept. 24, 2024, 4:22 p.m. UTC | #1
On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
> We actually want to check the available RAM, not the maximum RAM size.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Nit below.
> ---
>  target/s390x/kvm/pv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
> index dde836d21a..424cce75ca 100644
> --- a/target/s390x/kvm/pv.c
> +++ b/target/s390x/kvm/pv.c
> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>       * If the feature is not present or if the VM is not larger than 2 GiB,
>       * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>       */
> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>          !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>          return false;
>      }

If I understood the kernel code right, the decision is made wrt
the size of the gmap address space, which is the same as the
limit set for the VM. So using s390_get_memory_limit would be
semantically cleaner.
David Hildenbrand Sept. 24, 2024, 8:17 p.m. UTC | #2
On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:
> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
>> We actually want to check the available RAM, not the maximum RAM size.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
> Nit below.
>> ---
>>   target/s390x/kvm/pv.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
>> index dde836d21a..424cce75ca 100644
>> --- a/target/s390x/kvm/pv.c
>> +++ b/target/s390x/kvm/pv.c
>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>>        * If the feature is not present or if the VM is not larger than 2 GiB,
>>        * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>>        */
>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>>           !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>>           return false;
>>       }
> 
> If I understood the kernel code right, the decision is made wrt
> the size of the gmap address space, which is the same as the
> limit set for the VM. So using s390_get_memory_limit would be
> semantically cleaner.

I wonder if we should just drop the RAM size check. Not convinced the 
slightly faster reboot for such small VMs is really relevant? Makes the 
code more complicated than really necessary.
David Hildenbrand Sept. 26, 2024, 9:04 a.m. UTC | #3
On 24.09.24 22:17, David Hildenbrand wrote:
> On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:
>> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
>>> We actually want to check the available RAM, not the maximum RAM size.
>>>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>
>> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
>> Nit below.
>>> ---
>>>    target/s390x/kvm/pv.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
>>> index dde836d21a..424cce75ca 100644
>>> --- a/target/s390x/kvm/pv.c
>>> +++ b/target/s390x/kvm/pv.c
>>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>>>         * If the feature is not present or if the VM is not larger than 2 GiB,
>>>         * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>>>         */
>>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
>>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>>>            !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>>>            return false;
>>>        }
>>
>> If I understood the kernel code right, the decision is made wrt
>> the size of the gmap address space, which is the same as the
>> limit set for the VM. So using s390_get_memory_limit would be
>> semantically cleaner.
> 
> I wonder if we should just drop the RAM size check. Not convinced the
> slightly faster reboot for such small VMs is really relevant? Makes the
> code more complicated than really necessary.

Thinking about it,

diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
index 424cce75ca..bbb2108546 100644
--- a/target/s390x/kvm/pv.c
+++ b/target/s390x/kvm/pv.c
@@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
       * If the feature is not present or if the VM is not larger than 2 GiB,
       * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
       */
-    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
+    if (s390_get_memory_limit(ms) < 2 * GiB ||
          !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
          return false;
      }

Is probably the easiest change, thanks!
Christian Borntraeger Sept. 30, 2024, 11:15 a.m. UTC | #4
Am 24.09.24 um 22:17 schrieb David Hildenbrand:
> On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:
>> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
>>> We actually want to check the available RAM, not the maximum RAM size.
>>>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>
>> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
>> Nit below.
>>> ---
>>>   target/s390x/kvm/pv.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
>>> index dde836d21a..424cce75ca 100644
>>> --- a/target/s390x/kvm/pv.c
>>> +++ b/target/s390x/kvm/pv.c
>>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>>>        * If the feature is not present or if the VM is not larger than 2 GiB,
>>>        * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>>>        */
>>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
>>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>>>           !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>>>           return false;
>>>       }
>>
>> If I understood the kernel code right, the decision is made wrt
>> the size of the gmap address space, which is the same as the
>> limit set for the VM. So using s390_get_memory_limit would be
>> semantically cleaner.
> 
> I wonder if we should just drop the RAM size check. Not convinced the slightly faster reboot for such small VMs is really relevant? Makes the code more complicated than really necessary.

IIRC there have been functional issues with small guests and asnyc. Claudio, do you remember?
Claudio Imbrenda Sept. 30, 2024, 11:37 a.m. UTC | #5
On Mon, 30 Sep 2024 13:15:52 +0200
Christian Borntraeger <borntraeger@linux.ibm.com> wrote:

> Am 24.09.24 um 22:17 schrieb David Hildenbrand:
> > On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:  
> >> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:  
> >>> We actually want to check the available RAM, not the maximum RAM size.
> >>>
> >>> Signed-off-by: David Hildenbrand <david@redhat.com>  
> >>
> >> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
> >> Nit below.  
> >>> ---
> >>>   target/s390x/kvm/pv.c | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
> >>> index dde836d21a..424cce75ca 100644
> >>> --- a/target/s390x/kvm/pv.c
> >>> +++ b/target/s390x/kvm/pv.c
> >>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
> >>>        * If the feature is not present or if the VM is not larger than 2 GiB,
> >>>        * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
> >>>        */
> >>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
> >>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
> >>>           !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
> >>>           return false;
> >>>       }  
> >>
> >> If I understood the kernel code right, the decision is made wrt
> >> the size of the gmap address space, which is the same as the
> >> limit set for the VM. So using s390_get_memory_limit would be
> >> semantically cleaner.  
> > 
> > I wonder if we should just drop the RAM size check. Not convinced the slightly faster reboot for such small VMs is really relevant? Makes the code more complicated than really necessary.  
> 
> IIRC there have been functional issues with small guests and asnyc. Claudio, do you remember?

if we are <2G, KVM allocates a segment table as the highest level table
for the gmap ASCE. there are pointers lurking around in the reverse
mapping prefix_tree, which point directly into segment tables.

if the ASCE is region3 or higher, that's not an issue. if it's a
segment table, then it's an issue, because those pointers will end up
pointing into freed memory, once the old asce is freed.

in short, we have to guarantee that we will never set aside a gmap ASCE
if it is a segment table.
David Hildenbrand Sept. 30, 2024, 1:14 p.m. UTC | #6
On 30.09.24 13:37, Claudio Imbrenda wrote:
> On Mon, 30 Sep 2024 13:15:52 +0200
> Christian Borntraeger <borntraeger@linux.ibm.com> wrote:
> 
>> Am 24.09.24 um 22:17 schrieb David Hildenbrand:
>>> On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:
>>>> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
>>>>> We actually want to check the available RAM, not the maximum RAM size.
>>>>>
>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>
>>>> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
>>>> Nit below.
>>>>> ---
>>>>>    target/s390x/kvm/pv.c | 2 +-
>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
>>>>> index dde836d21a..424cce75ca 100644
>>>>> --- a/target/s390x/kvm/pv.c
>>>>> +++ b/target/s390x/kvm/pv.c
>>>>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>>>>>         * If the feature is not present or if the VM is not larger than 2 GiB,
>>>>>         * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>>>>>         */
>>>>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
>>>>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>>>>>            !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>>>>>            return false;
>>>>>        }
>>>>
>>>> If I understood the kernel code right, the decision is made wrt
>>>> the size of the gmap address space, which is the same as the
>>>> limit set for the VM. So using s390_get_memory_limit would be
>>>> semantically cleaner.
>>>
>>> I wonder if we should just drop the RAM size check. Not convinced the slightly faster reboot for such small VMs is really relevant? Makes the code more complicated than really necessary.
>>
>> IIRC there have been functional issues with small guests and asnyc. Claudio, do you remember?
> 
> if we are <2G, KVM allocates a segment table as the highest level table
> for the gmap ASCE. there are pointers lurking around in the reverse
> mapping prefix_tree, which point directly into segment tables.
> 
> if the ASCE is region3 or higher, that's not an issue. if it's a
> segment table, then it's an issue, because those pointers will end up
> pointing into freed memory, once the old asce is freed.
> 
> in short, we have to guarantee that we will never set aside a gmap ASCE
> if it is a segment table.

Thanks for the details, the kernel seems to properly safeguard against 
that. For now, I'll turn it into a check against the memory limit, which 
directly translates to the gmap ASCE used.

Thanks!
Claudio Imbrenda Sept. 30, 2024, 1:26 p.m. UTC | #7
On Mon, 30 Sep 2024 15:14:52 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 30.09.24 13:37, Claudio Imbrenda wrote:
> > On Mon, 30 Sep 2024 13:15:52 +0200
> > Christian Borntraeger <borntraeger@linux.ibm.com> wrote:
> >   
> >> Am 24.09.24 um 22:17 schrieb David Hildenbrand:  
> >>> On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:  
> >>>> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:  
> >>>>> We actually want to check the available RAM, not the maximum RAM size.
> >>>>>
> >>>>> Signed-off-by: David Hildenbrand <david@redhat.com>  
> >>>>
> >>>> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
> >>>> Nit below.  
> >>>>> ---
> >>>>>    target/s390x/kvm/pv.c | 2 +-
> >>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
> >>>>> index dde836d21a..424cce75ca 100644
> >>>>> --- a/target/s390x/kvm/pv.c
> >>>>> +++ b/target/s390x/kvm/pv.c
> >>>>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
> >>>>>         * If the feature is not present or if the VM is not larger than 2 GiB,
> >>>>>         * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
> >>>>>         */
> >>>>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
> >>>>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
> >>>>>            !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
> >>>>>            return false;
> >>>>>        }  
> >>>>
> >>>> If I understood the kernel code right, the decision is made wrt
> >>>> the size of the gmap address space, which is the same as the
> >>>> limit set for the VM. So using s390_get_memory_limit would be
> >>>> semantically cleaner.  
> >>>
> >>> I wonder if we should just drop the RAM size check. Not convinced the slightly faster reboot for such small VMs is really relevant? Makes the code more complicated than really necessary.  
> >>
> >> IIRC there have been functional issues with small guests and asnyc. Claudio, do you remember?  
> > 
> > if we are <2G, KVM allocates a segment table as the highest level table
> > for the gmap ASCE. there are pointers lurking around in the reverse
> > mapping prefix_tree, which point directly into segment tables.
> > 
> > if the ASCE is region3 or higher, that's not an issue. if it's a
> > segment table, then it's an issue, because those pointers will end up
> > pointing into freed memory, once the old asce is freed.
> > 
> > in short, we have to guarantee that we will never set aside a gmap ASCE
> > if it is a segment table.  
> 
> Thanks for the details, the kernel seems to properly safeguard against 

it does, but it returns an error if userspace tries; the check there is
to prevent qemu from emitting confusing error messages.

> that. For now, I'll turn it into a check against the memory limit, which 
> directly translates to the gmap ASCE used.

sounds good

> 
> Thanks!

no problem :)
diff mbox series

Patch

diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
index dde836d21a..424cce75ca 100644
--- a/target/s390x/kvm/pv.c
+++ b/target/s390x/kvm/pv.c
@@ -133,7 +133,7 @@  bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
      * If the feature is not present or if the VM is not larger than 2 GiB,
      * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
      */
-    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
+    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
         !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
         return false;
     }