diff mbox series

[v9,3/3] cpus-common: implement dirty page limit on vCPU

Message ID 9cc3cc5377e4330cbe0e87e89f452889516a4c09.1638495274.git.huangy81@chinatelecom.cn (mailing list archive)
State New, archived
Headers show
Series support dirty restraint on vCPU | expand

Commit Message

Hyman Huang Dec. 3, 2021, 1:39 a.m. UTC
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Implement dirtyrate calculation periodically basing on
dirty-ring and throttle vCPU until it reachs the quota
dirty page rate given by user.

Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit"
to enable, disable, query dirty page limit for virtual CPU.

Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit",
"info vcpu_dirty_limit" so developers can play with them easier.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 cpus-common.c         | 149 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hmp-commands-info.hx  |  13 +++++
 hmp-commands.hx       |  16 ++++++
 include/hw/core/cpu.h |   9 +++
 include/monitor/hmp.h |   2 +
 qapi/migration.json   |  48 ++++++++++++++++
 softmmu/vl.c          |   1 +
 7 files changed, 238 insertions(+)

Comments

Markus Armbruster Dec. 3, 2021, 12:34 p.m. UTC | #1
huangy81@chinatelecom.cn writes:

> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>
> Implement dirtyrate calculation periodically basing on
> dirty-ring and throttle vCPU until it reachs the quota
> dirty page rate given by user.
>
> Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit"
> to enable, disable, query dirty page limit for virtual CPU.
>
> Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit",
> "info vcpu_dirty_limit" so developers can play with them easier.
>
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

[...]

I see you replaced the interface.  Back to square one...

> diff --git a/qapi/migration.json b/qapi/migration.json
> index 3da8fdf..dc15b3f 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1872,6 +1872,54 @@
>              'current-rate': 'int64' } }
>  
>  ##
> +# @vcpu-dirty-limit:
> +#
> +# Set or cancel the upper limit of dirty page rate for a virtual CPU.
> +#
> +# Requires KVM with accelerator property "dirty-ring-size" set.
> +# A virtual CPU's dirty page rate is a measure of its memory load.
> +# To observe dirty page rates, use @calc-dirty-rate.
> +#
> +# @cpu-index: index of virtual CPU.
> +#
> +# @enable: true to enable, false to disable.
> +#
> +# @dirty-rate: upper limit of dirty page rate for virtual CPU.
> +#
> +# Since: 7.0
> +#
> +# Example:
> +#   {"execute": "vcpu-dirty-limit"}
> +#    "arguments": { "cpu-index": 0,
> +#                   "enable": true,
> +#                   "dirty-rate": 200 } }
> +#
> +##
> +{ 'command': 'vcpu-dirty-limit',
> +  'data': { 'cpu-index': 'int',
> +            'enable': 'bool',
> +            'dirty-rate': 'uint64'} }

When @enable is false, @dirty-rate makes no sense and is not used (I
checked the code), but users have to specify it anyway.  That's bad
design.

Better: drop @enable, make @dirty-rate optional, present means enable,
absent means disable.

> +
> +##
> +# @query-vcpu-dirty-limit:
> +#
> +# Returns information about the virtual CPU dirty limit status.
> +#
> +# @cpu-index: index of the virtual CPU to query, if not specified, all
> +#             virtual CPUs will be queried.
> +#
> +# Since: 7.0
> +#
> +# Example:
> +#   {"execute": "query-vcpu-dirty-limit"}
> +#    "arguments": { "cpu-index": 0 } }
> +#
> +##
> +{ 'command': 'query-vcpu-dirty-limit',
> +  'data': { '*cpu-index': 'int' },
> +  'returns': [ 'DirtyLimitInfo' ] }

Why would anyone ever want to specify @cpu-index?  Output isn't that
large even if you have a few hundred CPUs.

Let's keep things simple and drop the parameter.

> +
> +##
>  # @snapshot-save:
>  #
>  # Save a VM snapshot
> diff --git a/softmmu/vl.c b/softmmu/vl.c
> index 620a1f1..0f83ce3 100644
> --- a/softmmu/vl.c
> +++ b/softmmu/vl.c
> @@ -3777,5 +3777,6 @@ void qemu_init(int argc, char **argv, char **envp)
>      qemu_init_displays();
>      accel_setup_post(current_machine);
>      os_setup_post();
> +    dirtylimit_setup(current_machine->smp.max_cpus);
>      resume_mux_open();
>  }
Hyman Huang Dec. 4, 2021, noon UTC | #2
在 2021/12/3 20:34, Markus Armbruster 写道:
> huangy81@chinatelecom.cn writes:
> 
>> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>
>> Implement dirtyrate calculation periodically basing on
>> dirty-ring and throttle vCPU until it reachs the quota
>> dirty page rate given by user.
>>
>> Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit"
>> to enable, disable, query dirty page limit for virtual CPU.
>>
>> Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit",
>> "info vcpu_dirty_limit" so developers can play with them easier.
>>
>> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> [...]
> 
> I see you replaced the interface.  Back to square one...
> 
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index 3da8fdf..dc15b3f 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -1872,6 +1872,54 @@
>>               'current-rate': 'int64' } }
>>   
>>   ##
>> +# @vcpu-dirty-limit:
>> +#
>> +# Set or cancel the upper limit of dirty page rate for a virtual CPU.
>> +#
>> +# Requires KVM with accelerator property "dirty-ring-size" set.
>> +# A virtual CPU's dirty page rate is a measure of its memory load.
>> +# To observe dirty page rates, use @calc-dirty-rate.
>> +#
>> +# @cpu-index: index of virtual CPU.
>> +#
>> +# @enable: true to enable, false to disable.
>> +#
>> +# @dirty-rate: upper limit of dirty page rate for virtual CPU.
>> +#
>> +# Since: 7.0
>> +#
>> +# Example:
>> +#   {"execute": "vcpu-dirty-limit"}
>> +#    "arguments": { "cpu-index": 0,
>> +#                   "enable": true,
>> +#                   "dirty-rate": 200 } }
>> +#
>> +##
>> +{ 'command': 'vcpu-dirty-limit',
>> +  'data': { 'cpu-index': 'int',
>> +            'enable': 'bool',
>> +            'dirty-rate': 'uint64'} }
> 
> When @enable is false, @dirty-rate makes no sense and is not used (I
> checked the code), but users have to specify it anyway.  That's bad
> design.
> 
> Better: drop @enable, make @dirty-rate optional, present means enable,
> absent means disable.
Uh, if we drop @enable, enabling dirty limit should be like:
vcpu-dirty-limit cpu-index=0 dirty-rate=1000

And disabling dirty limit like:
vcpu-dirty-limit cpu-index=0

For disabling case, there is no hint of disabling in command 
"vcpu-dirty-limit".

How about make @dirty-rate optional, when enable dirty limit, it should
present, ignored otherwise?

> 
>> +
>> +##
>> +# @query-vcpu-dirty-limit:
>> +#
>> +# Returns information about the virtual CPU dirty limit status.
>> +#
>> +# @cpu-index: index of the virtual CPU to query, if not specified, all
>> +#             virtual CPUs will be queried.
>> +#
>> +# Since: 7.0
>> +#
>> +# Example:
>> +#   {"execute": "query-vcpu-dirty-limit"}
>> +#    "arguments": { "cpu-index": 0 } }
>> +#
>> +##
>> +{ 'command': 'query-vcpu-dirty-limit',
>> +  'data': { '*cpu-index': 'int' },
>> +  'returns': [ 'DirtyLimitInfo' ] }
> 
> Why would anyone ever want to specify @cpu-index?  Output isn't that
> large even if you have a few hundred CPUs.
> 
> Let's keep things simple and drop the parameter.
Ok, this make things simple.
> 
>> +
>> +##
>>   # @snapshot-save:
>>   #
>>   # Save a VM snapshot
>> diff --git a/softmmu/vl.c b/softmmu/vl.c
>> index 620a1f1..0f83ce3 100644
>> --- a/softmmu/vl.c
>> +++ b/softmmu/vl.c
>> @@ -3777,5 +3777,6 @@ void qemu_init(int argc, char **argv, char **envp)
>>       qemu_init_displays();
>>       accel_setup_post(current_machine);
>>       os_setup_post();
>> +    dirtylimit_setup(current_machine->smp.max_cpus);
>>       resume_mux_open();
>>   }
>
Peter Xu Dec. 6, 2021, 8:28 a.m. UTC | #3
On Sat, Dec 04, 2021 at 08:00:19PM +0800, Hyman Huang wrote:
> 
> 
> 在 2021/12/3 20:34, Markus Armbruster 写道:
> > huangy81@chinatelecom.cn writes:
> > 
> > > From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> > > 
> > > Implement dirtyrate calculation periodically basing on
> > > dirty-ring and throttle vCPU until it reachs the quota
> > > dirty page rate given by user.
> > > 
> > > Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit"
> > > to enable, disable, query dirty page limit for virtual CPU.
> > > 
> > > Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit",
> > > "info vcpu_dirty_limit" so developers can play with them easier.
> > > 
> > > Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> > 
> > [...]
> > 
> > I see you replaced the interface.  Back to square one...
> > 
> > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > index 3da8fdf..dc15b3f 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -1872,6 +1872,54 @@
> > >               'current-rate': 'int64' } }
> > >   ##
> > > +# @vcpu-dirty-limit:
> > > +#
> > > +# Set or cancel the upper limit of dirty page rate for a virtual CPU.
> > > +#
> > > +# Requires KVM with accelerator property "dirty-ring-size" set.
> > > +# A virtual CPU's dirty page rate is a measure of its memory load.
> > > +# To observe dirty page rates, use @calc-dirty-rate.
> > > +#
> > > +# @cpu-index: index of virtual CPU.
> > > +#
> > > +# @enable: true to enable, false to disable.
> > > +#
> > > +# @dirty-rate: upper limit of dirty page rate for virtual CPU.
> > > +#
> > > +# Since: 7.0
> > > +#
> > > +# Example:
> > > +#   {"execute": "vcpu-dirty-limit"}
> > > +#    "arguments": { "cpu-index": 0,
> > > +#                   "enable": true,
> > > +#                   "dirty-rate": 200 } }
> > > +#
> > > +##
> > > +{ 'command': 'vcpu-dirty-limit',
> > > +  'data': { 'cpu-index': 'int',
> > > +            'enable': 'bool',
> > > +            'dirty-rate': 'uint64'} }
> > 
> > When @enable is false, @dirty-rate makes no sense and is not used (I
> > checked the code), but users have to specify it anyway.  That's bad
> > design.
> > 
> > Better: drop @enable, make @dirty-rate optional, present means enable,
> > absent means disable.
> Uh, if we drop @enable, enabling dirty limit should be like:
> vcpu-dirty-limit cpu-index=0 dirty-rate=1000
> 
> And disabling dirty limit like:
> vcpu-dirty-limit cpu-index=0
> 
> For disabling case, there is no hint of disabling in command
> "vcpu-dirty-limit".
> 
> How about make @dirty-rate optional, when enable dirty limit, it should
> present, ignored otherwise?

Sounds good, I think we can make both "enable" and "dirty-rate" optional.

To turn it on we either use "enable=true,dirty-rate=XXX" or "dirty-rate=XXX".

To turn it off we use "enable=false".

> 
> > 
> > > +
> > > +##
> > > +# @query-vcpu-dirty-limit:
> > > +#
> > > +# Returns information about the virtual CPU dirty limit status.
> > > +#
> > > +# @cpu-index: index of the virtual CPU to query, if not specified, all
> > > +#             virtual CPUs will be queried.
> > > +#
> > > +# Since: 7.0
> > > +#
> > > +# Example:
> > > +#   {"execute": "query-vcpu-dirty-limit"}
> > > +#    "arguments": { "cpu-index": 0 } }
> > > +#
> > > +##
> > > +{ 'command': 'query-vcpu-dirty-limit',
> > > +  'data': { '*cpu-index': 'int' },
> > > +  'returns': [ 'DirtyLimitInfo' ] }
> > 
> > Why would anyone ever want to specify @cpu-index?  Output isn't that
> > large even if you have a few hundred CPUs.
> > 
> > Let's keep things simple and drop the parameter.
> Ok, this make things simple.

I found that it'll be challenging for any human being to identify "whether
he/she has turned throttle off for all vcpus"..  I think that could be useful
when we finally decided to cancel current migration.

I thought about adding a "global=on/off" flag, but instead can we just return
the vcpu info for the ones that enabled the per-vcpu throttling?  For anyone
who wants to read all vcpu dirty information he/she can use calc-dirty-rate.

Thanks,
Peter Xu Dec. 6, 2021, 8:36 a.m. UTC | #4
On Fri, Dec 03, 2021 at 09:39:47AM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> Implement dirtyrate calculation periodically basing on
> dirty-ring and throttle vCPU until it reachs the quota
> dirty page rate given by user.
> 
> Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit"
> to enable, disable, query dirty page limit for virtual CPU.
> 
> Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit",
> "info vcpu_dirty_limit" so developers can play with them easier.

Thanks.  Even if I start to use qmp-shell more recently but still in some case
where only hmp is specified this could still be handy.

> +void qmp_vcpu_dirty_limit(int64_t cpu_index,
> +                          bool enable,
> +                          uint64_t dirty_rate,
> +                          Error **errp)
> +{
> +    if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
> +        error_setg(errp, "dirty page limit feature requires KVM with"
> +                   " accelerator property 'dirty-ring-size' set'");
> +        return;
> +    }
> +
> +    if (!dirtylimit_is_vcpu_index_valid(cpu_index)) {
> +        error_setg(errp, "cpu index out of range");
> +        return;
> +    }
> +
> +    if (enable) {
> +        dirtylimit_calc();
> +        dirtylimit_vcpu(cpu_index, dirty_rate);
> +    } else {
> +        if (!dirtylimit_enabled(cpu_index)) {
> +            error_setg(errp, "dirty page limit for CPU %ld not set",
> +                       cpu_index);
> +            return;
> +        }

We don't need to fail the user for enable=off even if vcpu is not throttled,
imho.

> +
> +        if (!dirtylimit_cancel_vcpu(cpu_index)) {
> +            dirtylimit_calc_quit();
> +        }
> +    }
> +}

[...]

> +struct DirtyLimitInfoList *qmp_query_vcpu_dirty_limit(bool has_cpu_index,
> +                                                      int64_t cpu_index,
> +                                                      Error **errp)
> +{
> +    DirtyLimitInfo *info = NULL;
> +    DirtyLimitInfoList *head = NULL, **tail = &head;
> +
> +    if (has_cpu_index &&
> +        (!dirtylimit_is_vcpu_index_valid(cpu_index))) {
> +        error_setg(errp, "cpu index out of range");
> +        return NULL;
> +    }
> +
> +    if (has_cpu_index) {
> +        info = dirtylimit_query_vcpu(cpu_index);
> +        QAPI_LIST_APPEND(tail, info);
> +    } else {
> +        CPUState *cpu;
> +        CPU_FOREACH(cpu) {
> +            if (!cpu->unplug) {
> +                info = dirtylimit_query_vcpu(cpu->cpu_index);
> +                QAPI_LIST_APPEND(tail, info);
> +            }

There're special handling for unplug in a few places.  Could you explain why?
E.g. if the vcpu is unplugged then dirty rate is zero, then it seems fine to
even keep it there?

> +        }
> +    }
> +
> +    return head;
> +}
Peter Xu Dec. 6, 2021, 8:39 a.m. UTC | #5
On Fri, Dec 03, 2021 at 09:39:47AM +0800, huangy81@chinatelecom.cn wrote:
> +void dirtylimit_setup(int max_cpus)
> +{
> +    if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
> +        return;
> +    }
> +
> +    dirtylimit_calc_state_init(max_cpus);
> +    dirtylimit_state_init(max_cpus);
> +}

[...]

> diff --git a/softmmu/vl.c b/softmmu/vl.c
> index 620a1f1..0f83ce3 100644
> --- a/softmmu/vl.c
> +++ b/softmmu/vl.c
> @@ -3777,5 +3777,6 @@ void qemu_init(int argc, char **argv, char **envp)
>      qemu_init_displays();
>      accel_setup_post(current_machine);
>      os_setup_post();
> +    dirtylimit_setup(current_machine->smp.max_cpus);
>      resume_mux_open();

Can we do the init only when someone enables it?  We could also do proper
free() for the structs when it's globally turned off.
Hyman Huang Dec. 6, 2021, 2:56 p.m. UTC | #6
在 2021/12/6 16:28, Peter Xu 写道:
> On Sat, Dec 04, 2021 at 08:00:19PM +0800, Hyman Huang wrote:
>>
>>
>> 在 2021/12/3 20:34, Markus Armbruster 写道:
>>> huangy81@chinatelecom.cn writes:
>>>
>>>> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>>>
>>>> Implement dirtyrate calculation periodically basing on
>>>> dirty-ring and throttle vCPU until it reachs the quota
>>>> dirty page rate given by user.
>>>>
>>>> Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit"
>>>> to enable, disable, query dirty page limit for virtual CPU.
>>>>
>>>> Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit",
>>>> "info vcpu_dirty_limit" so developers can play with them easier.
>>>>
>>>> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>>
>>> [...]
>>>
>>> I see you replaced the interface.  Back to square one...
>>>
>>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>>> index 3da8fdf..dc15b3f 100644
>>>> --- a/qapi/migration.json
>>>> +++ b/qapi/migration.json
>>>> @@ -1872,6 +1872,54 @@
>>>>                'current-rate': 'int64' } }
>>>>    ##
>>>> +# @vcpu-dirty-limit:
>>>> +#
>>>> +# Set or cancel the upper limit of dirty page rate for a virtual CPU.
>>>> +#
>>>> +# Requires KVM with accelerator property "dirty-ring-size" set.
>>>> +# A virtual CPU's dirty page rate is a measure of its memory load.
>>>> +# To observe dirty page rates, use @calc-dirty-rate.
>>>> +#
>>>> +# @cpu-index: index of virtual CPU.
>>>> +#
>>>> +# @enable: true to enable, false to disable.
>>>> +#
>>>> +# @dirty-rate: upper limit of dirty page rate for virtual CPU.
>>>> +#
>>>> +# Since: 7.0
>>>> +#
>>>> +# Example:
>>>> +#   {"execute": "vcpu-dirty-limit"}
>>>> +#    "arguments": { "cpu-index": 0,
>>>> +#                   "enable": true,
>>>> +#                   "dirty-rate": 200 } }
>>>> +#
>>>> +##
>>>> +{ 'command': 'vcpu-dirty-limit',
>>>> +  'data': { 'cpu-index': 'int',
>>>> +            'enable': 'bool',
>>>> +            'dirty-rate': 'uint64'} }
>>>
>>> When @enable is false, @dirty-rate makes no sense and is not used (I
>>> checked the code), but users have to specify it anyway.  That's bad
>>> design.
>>>
>>> Better: drop @enable, make @dirty-rate optional, present means enable,
>>> absent means disable.
>> Uh, if we drop @enable, enabling dirty limit should be like:
>> vcpu-dirty-limit cpu-index=0 dirty-rate=1000
>>
>> And disabling dirty limit like:
>> vcpu-dirty-limit cpu-index=0
>>
>> For disabling case, there is no hint of disabling in command
>> "vcpu-dirty-limit".
>>
>> How about make @dirty-rate optional, when enable dirty limit, it should
>> present, ignored otherwise?
> 
> Sounds good, I think we can make both "enable" and "dirty-rate" optional.
> 
> To turn it on we either use "enable=true,dirty-rate=XXX" or "dirty-rate=XXX" >
> To turn it off we use "enable=false".
Indeed, this make things more convenient.
>  >>
>>>
>>>> +
>>>> +##
>>>> +# @query-vcpu-dirty-limit:
>>>> +#
>>>> +# Returns information about the virtual CPU dirty limit status.
>>>> +#
>>>> +# @cpu-index: index of the virtual CPU to query, if not specified, all
>>>> +#             virtual CPUs will be queried.
>>>> +#
>>>> +# Since: 7.0
>>>> +#
>>>> +# Example:
>>>> +#   {"execute": "query-vcpu-dirty-limit"}
>>>> +#    "arguments": { "cpu-index": 0 } }
>>>> +#
>>>> +##
>>>> +{ 'command': 'query-vcpu-dirty-limit',
>>>> +  'data': { '*cpu-index': 'int' },
>>>> +  'returns': [ 'DirtyLimitInfo' ] }
>>>
>>> Why would anyone ever want to specify @cpu-index?  Output isn't that
>>> large even if you have a few hundred CPUs.
>>>
>>> Let's keep things simple and drop the parameter.
>> Ok, this make things simple.
> 
> I found that it'll be challenging for any human being to identify "whether
> he/she has turned throttle off for all vcpus"..  I think that could be useful
> when we finally decided to cancel current migration.
That's question, how about adding an optional argument "global" and 
making "cpu-index", "enable", "dirty-rate" all optional in 
"vcpu-dirty-limit", keeping the "cpu-index" and "global" options 
mutually exclusive?
{ 'command': 'vcpu-dirty-limit',
   'data': { '*cpu-index': 'int',
             '*global': 'bool'
             '*enable': 'bool',
             '*dirty-rate': 'uint64'} }
In the case of enabling all vcpu throttle:
Either use "global=true,enable=true,dirty-rate=XXX" or 
"global=true,dirty-rate=XXX"

In the case of disabling all vcpu throttle:
use "global=true,enable=false,dirty-rate=XXX"

In other case, we pass the same option just like what we did for 
specified vcpu throttle before.
> 
> I thought about adding a "global=on/off" flag, but instead can we just return
> the vcpu info for the ones that enabled the per-vcpu throttling?  For anyone
> who wants to read all vcpu dirty information he/she can use calc-dirty-rate.
> 
Ok, I'll pick up this advice next version.
> Thanks,
>
Hyman Huang Dec. 6, 2021, 3:19 p.m. UTC | #7
在 2021/12/6 16:36, Peter Xu 写道:
> On Fri, Dec 03, 2021 at 09:39:47AM +0800, huangy81@chinatelecom.cn wrote:
>> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>
>> Implement dirtyrate calculation periodically basing on
>> dirty-ring and throttle vCPU until it reachs the quota
>> dirty page rate given by user.
>>
>> Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit"
>> to enable, disable, query dirty page limit for virtual CPU.
>>
>> Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit",
>> "info vcpu_dirty_limit" so developers can play with them easier.
> 
> Thanks.  Even if I start to use qmp-shell more recently but still in some case
> where only hmp is specified this could still be handy.
> 
>> +void qmp_vcpu_dirty_limit(int64_t cpu_index,
>> +                          bool enable,
>> +                          uint64_t dirty_rate,
>> +                          Error **errp)
>> +{
>> +    if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
>> +        error_setg(errp, "dirty page limit feature requires KVM with"
>> +                   " accelerator property 'dirty-ring-size' set'");
>> +        return;
>> +    }
>> +
>> +    if (!dirtylimit_is_vcpu_index_valid(cpu_index)) {
>> +        error_setg(errp, "cpu index out of range");
>> +        return;
>> +    }
>> +
>> +    if (enable) {
>> +        dirtylimit_calc();
>> +        dirtylimit_vcpu(cpu_index, dirty_rate);
>> +    } else {
>> +        if (!dirtylimit_enabled(cpu_index)) {
>> +            error_setg(errp, "dirty page limit for CPU %ld not set",
>> +                       cpu_index);
>> +            return;
>> +        }
> 
> We don't need to fail the user for enable=off even if vcpu is not throttled,
> imho.
Ok.
> 
>> +
>> +        if (!dirtylimit_cancel_vcpu(cpu_index)) {
>> +            dirtylimit_calc_quit();
>> +        }
>> +    }
>> +}
> 
> [...]
> 
>> +struct DirtyLimitInfoList *qmp_query_vcpu_dirty_limit(bool has_cpu_index,
>> +                                                      int64_t cpu_index,
>> +                                                      Error **errp)
>> +{
>> +    DirtyLimitInfo *info = NULL;
>> +    DirtyLimitInfoList *head = NULL, **tail = &head;
>> +
>> +    if (has_cpu_index &&
>> +        (!dirtylimit_is_vcpu_index_valid(cpu_index))) {
>> +        error_setg(errp, "cpu index out of range");
>> +        return NULL;
>> +    }
>> +
>> +    if (has_cpu_index) {
>> +        info = dirtylimit_query_vcpu(cpu_index);
>> +        QAPI_LIST_APPEND(tail, info);
>> +    } else {
>> +        CPUState *cpu;
>> +        CPU_FOREACH(cpu) {
>> +            if (!cpu->unplug) {
>> +                info = dirtylimit_query_vcpu(cpu->cpu_index);
>> +                QAPI_LIST_APPEND(tail, info);
>> +            }
> 
> There're special handling for unplug in a few places.  Could you explain why?
> E.g. if the vcpu is unplugged then dirty rate is zero, then it seems fine to
> even keep it there?
> The dirty limit logic only allow plugged vcpu to be enabled throttle, so 
that the "dirtylimit-{cpu-index}" thread don't need to be forked and we 
can save the overhead. So in query logic we just filter the unplugged vcpu.

Another reason is that i thought it could make user confused when we 
return the unplugged vcpu dirtylimit info. Uh, in most time of vm 
lifecycle, hotplugging vcpu may never happen.
>> +        }
>> +    }
>> +
>> +    return head;
>> +}
>
Hyman Huang Dec. 6, 2021, 3:22 p.m. UTC | #8
在 2021/12/6 16:39, Peter Xu 写道:
> On Fri, Dec 03, 2021 at 09:39:47AM +0800, huangy81@chinatelecom.cn wrote:
>> +void dirtylimit_setup(int max_cpus)
>> +{
>> +    if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
>> +        return;
>> +    }
>> +
>> +    dirtylimit_calc_state_init(max_cpus);
>> +    dirtylimit_state_init(max_cpus);
>> +}
> 
> [...]
> 
>> diff --git a/softmmu/vl.c b/softmmu/vl.c
>> index 620a1f1..0f83ce3 100644
>> --- a/softmmu/vl.c
>> +++ b/softmmu/vl.c
>> @@ -3777,5 +3777,6 @@ void qemu_init(int argc, char **argv, char **envp)
>>       qemu_init_displays();
>>       accel_setup_post(current_machine);
>>       os_setup_post();
>> +    dirtylimit_setup(current_machine->smp.max_cpus);
>>       resume_mux_open();
> 
> Can we do the init only when someone enables it?  We could also do proper
> free() for the structs when it's globally turned off.
Yes, i'll try this next version
>
Peter Xu Dec. 7, 2021, 2:24 a.m. UTC | #9
On Mon, Dec 06, 2021 at 10:56:00PM +0800, Hyman wrote:
> > I found that it'll be challenging for any human being to identify "whether
> > he/she has turned throttle off for all vcpus"..  I think that could be useful
> > when we finally decided to cancel current migration.
> That's question, how about adding an optional argument "global" and making
> "cpu-index", "enable", "dirty-rate" all optional in "vcpu-dirty-limit",
> keeping the "cpu-index" and "global" options mutually exclusive?
> { 'command': 'vcpu-dirty-limit',
>   'data': { '*cpu-index': 'int',
>             '*global': 'bool'
>             '*enable': 'bool',
>             '*dirty-rate': 'uint64'} }
> In the case of enabling all vcpu throttle:
> Either use "global=true,enable=true,dirty-rate=XXX" or
> "global=true,dirty-rate=XXX"
> 
> In the case of disabling all vcpu throttle:
> use "global=true,enable=false,dirty-rate=XXX"
> 
> In other case, we pass the same option just like what we did for specified
> vcpu throttle before.

Could we merge "cpu-index" and "global" somehow?  They're mutual exclusive.

For example, merge them into one "vcpu" parameter, "vcpu=all" means global,
"vcpu=1" means vcpu 1.  But then we'll need to make it a string.
Peter Xu Dec. 7, 2021, 2:57 a.m. UTC | #10
On Mon, Dec 06, 2021 at 11:19:21PM +0800, Hyman wrote:
> > > +    if (has_cpu_index) {
> > > +        info = dirtylimit_query_vcpu(cpu_index);
> > > +        QAPI_LIST_APPEND(tail, info);
> > > +    } else {
> > > +        CPUState *cpu;
> > > +        CPU_FOREACH(cpu) {
> > > +            if (!cpu->unplug) {
> > > +                info = dirtylimit_query_vcpu(cpu->cpu_index);
> > > +                QAPI_LIST_APPEND(tail, info);
> > > +            }
> > 
> > There're special handling for unplug in a few places.  Could you explain why?
> > E.g. if the vcpu is unplugged then dirty rate is zero, then it seems fine to
> > even keep it there?
> > The dirty limit logic only allow plugged vcpu to be enabled throttle, so
> that the "dirtylimit-{cpu-index}" thread don't need to be forked and we can
> save the overhead. So in query logic we just filter the unplugged vcpu.

I've commented similarly in the other thread - please consider not using NVCPU
threads only for vcpu throttling, irrelevant of vcpu hot plug/unplug.

Per-vcpu throttle is totally not a cpu intensive workload, 1 thread should be
enough globally, imho.

A guest with hundreds of vcpus are becoming more common, we shouldn't waste OS
thread resources just for this.

> 
> Another reason is that i thought it could make user confused when we return
> the unplugged vcpu dirtylimit info. Uh, in most time of vm lifecycle,
> hotplugging vcpu may never happen.

I just think if plug/unplug does not affect the throttle logic then we should
treat them the same, it avoids unnecessary special care on those vcpus too.
Hyman Huang Dec. 7, 2021, 4:32 a.m. UTC | #11
在 2021/12/7 10:24, Peter Xu 写道:
> On Mon, Dec 06, 2021 at 10:56:00PM +0800, Hyman wrote:
>>> I found that it'll be challenging for any human being to identify "whether
>>> he/she has turned throttle off for all vcpus"..  I think that could be useful
>>> when we finally decided to cancel current migration.
>> That's question, how about adding an optional argument "global" and making
>> "cpu-index", "enable", "dirty-rate" all optional in "vcpu-dirty-limit",
>> keeping the "cpu-index" and "global" options mutually exclusive?
>> { 'command': 'vcpu-dirty-limit',
>>    'data': { '*cpu-index': 'int',
>>              '*global': 'bool'
>>              '*enable': 'bool',
>>              '*dirty-rate': 'uint64'} }
>> In the case of enabling all vcpu throttle:
>> Either use "global=true,enable=true,dirty-rate=XXX" or
>> "global=true,dirty-rate=XXX"
>>
>> In the case of disabling all vcpu throttle:
>> use "global=true,enable=false,dirty-rate=XXX"
>>
>> In other case, we pass the same option just like what we did for specified
>> vcpu throttle before.
> 
> Could we merge "cpu-index" and "global" somehow?  They're mutual exclusive. >
> For example, merge them into one "vcpu" parameter, "vcpu=all" means global,
> "vcpu=1" means vcpu 1.  But then we'll need to make it a string.
>Ok, sound good
Hyman Huang Dec. 7, 2021, 4:38 a.m. UTC | #12
在 2021/12/7 10:57, Peter Xu 写道:
> On Mon, Dec 06, 2021 at 11:19:21PM +0800, Hyman wrote:
>>>> +    if (has_cpu_index) {
>>>> +        info = dirtylimit_query_vcpu(cpu_index);
>>>> +        QAPI_LIST_APPEND(tail, info);
>>>> +    } else {
>>>> +        CPUState *cpu;
>>>> +        CPU_FOREACH(cpu) {
>>>> +            if (!cpu->unplug) {
>>>> +                info = dirtylimit_query_vcpu(cpu->cpu_index);
>>>> +                QAPI_LIST_APPEND(tail, info);
>>>> +            }
>>>
>>> There're special handling for unplug in a few places.  Could you explain why?
>>> E.g. if the vcpu is unplugged then dirty rate is zero, then it seems fine to
>>> even keep it there?
>>> The dirty limit logic only allow plugged vcpu to be enabled throttle, so
>> that the "dirtylimit-{cpu-index}" thread don't need to be forked and we can
>> save the overhead. So in query logic we just filter the unplugged vcpu.
> 
> I've commented similarly in the other thread - please consider not using NVCPU
> threads only for vcpu throttling, irrelevant of vcpu hot plug/unplug.
> 
> Per-vcpu throttle is totally not a cpu intensive workload, 1 thread should be
> enough globally, imho.
> 
> A guest with hundreds of vcpus are becoming more common, we shouldn't waste OS
> thread resources just for this.
> 
Ok, i'll try this out next version
>>
>> Another reason is that i thought it could make user confused when we return
>> the unplugged vcpu dirtylimit info. Uh, in most time of vm lifecycle,
>> hotplugging vcpu may never happen.
> 
> I just think if plug/unplug does not affect the throttle logic then we should
> treat them the same, it avoids unnecessary special care on those vcpus too.
> 
Indeed, i'm struggling too :), i'll remove the plug/unplug logic the 
next version.
diff mbox series

Patch

diff --git a/cpus-common.c b/cpus-common.c
index 6e73d3e..04b9bc9 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -23,6 +23,14 @@ 
 #include "hw/core/cpu.h"
 #include "sysemu/cpus.h"
 #include "qemu/lockable.h"
+#include "sysemu/dirtylimit.h"
+#include "sysemu/cpu-throttle.h"
+#include "sysemu/kvm.h"
+#include "monitor/hmp.h"
+#include "monitor/monitor.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-migration.h"
 
 static QemuMutex qemu_cpu_list_lock;
 static QemuCond exclusive_cond;
@@ -352,3 +360,144 @@  void process_queued_cpu_work(CPUState *cpu)
     qemu_mutex_unlock(&cpu->work_mutex);
     qemu_cond_broadcast(&qemu_work_cond);
 }
+
+void qmp_vcpu_dirty_limit(int64_t cpu_index,
+                          bool enable,
+                          uint64_t dirty_rate,
+                          Error **errp)
+{
+    if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
+        error_setg(errp, "dirty page limit feature requires KVM with"
+                   " accelerator property 'dirty-ring-size' set'");
+        return;
+    }
+
+    if (!dirtylimit_is_vcpu_index_valid(cpu_index)) {
+        error_setg(errp, "cpu index out of range");
+        return;
+    }
+
+    if (enable) {
+        dirtylimit_calc();
+        dirtylimit_vcpu(cpu_index, dirty_rate);
+    } else {
+        if (!dirtylimit_enabled(cpu_index)) {
+            error_setg(errp, "dirty page limit for CPU %ld not set",
+                       cpu_index);
+            return;
+        }
+
+        if (!dirtylimit_cancel_vcpu(cpu_index)) {
+            dirtylimit_calc_quit();
+        }
+    }
+}
+
+void hmp_vcpu_dirty_limit(Monitor *mon, const QDict *qdict)
+{
+    int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1);
+    int64_t dirty_rate = qdict_get_try_int(qdict, "dirty_rate", -1);
+    bool enable = qdict_get_bool(qdict, "enable");
+    Error *err = NULL;
+
+    if (enable && dirty_rate < 0) {
+        monitor_printf(mon, "Enabling dirty limit requires dirty_rate set!\n");
+        return;
+    }
+
+    if (!dirtylimit_is_vcpu_index_valid(cpu_index)) {
+        monitor_printf(mon, "Incorrect cpu index specified!\n");
+        return;
+    }
+
+    qmp_vcpu_dirty_limit(cpu_index, enable, dirty_rate, &err);
+
+    if (err) {
+        hmp_handle_error(mon, err);
+        return;
+    }
+
+    monitor_printf(mon, "Enabling dirty page rate limit with %"PRIi64
+                   " MB/s\n", dirty_rate);
+
+    monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query "
+                   "dirty limit for virtual CPU]\n");
+}
+
+struct DirtyLimitInfoList *qmp_query_vcpu_dirty_limit(bool has_cpu_index,
+                                                      int64_t cpu_index,
+                                                      Error **errp)
+{
+    DirtyLimitInfo *info = NULL;
+    DirtyLimitInfoList *head = NULL, **tail = &head;
+
+    if (has_cpu_index &&
+        (!dirtylimit_is_vcpu_index_valid(cpu_index))) {
+        error_setg(errp, "cpu index out of range");
+        return NULL;
+    }
+
+    if (has_cpu_index) {
+        info = dirtylimit_query_vcpu(cpu_index);
+        QAPI_LIST_APPEND(tail, info);
+    } else {
+        CPUState *cpu;
+        CPU_FOREACH(cpu) {
+            if (!cpu->unplug) {
+                info = dirtylimit_query_vcpu(cpu->cpu_index);
+                QAPI_LIST_APPEND(tail, info);
+            }
+        }
+    }
+
+    return head;
+}
+
+void hmp_info_vcpu_dirty_limit(Monitor *mon, const QDict *qdict)
+{
+    DirtyLimitInfoList *limit, *head, *info = NULL;
+    int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1);
+    Error *err = NULL;
+
+    if (cpu_index >=0 &&
+        !dirtylimit_is_vcpu_index_valid(cpu_index)) {
+        monitor_printf(mon, "cpu index out of range\n");
+        return;
+    }
+
+    if (cpu_index < 0) {
+        info = qmp_query_vcpu_dirty_limit(false, -1, &err);
+    } else {
+        info = qmp_query_vcpu_dirty_limit(true, cpu_index, &err);
+    }
+
+    if (err) {
+        hmp_handle_error(mon, err);
+        return;
+    }
+
+    head = info;
+    for (limit = head; limit != NULL; limit = limit->next) {
+        monitor_printf(mon, "vcpu[%"PRIi64"], Enable: %s",
+                       limit->value->cpu_index,
+                       limit->value->enable ? "true" : "false");
+        if (limit->value->enable) {
+            monitor_printf(mon, ", limit rate %"PRIi64 " (MB/s),"
+                           " current rate %"PRIi64 " (MB/s)\n",
+                           limit->value->limit_rate,
+                           limit->value->current_rate);
+        } else {
+            monitor_printf(mon, "\n");
+        }
+    }
+}
+
+void dirtylimit_setup(int max_cpus)
+{
+    if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
+        return;
+    }
+
+    dirtylimit_calc_state_init(max_cpus);
+    dirtylimit_state_init(max_cpus);
+}
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 407a1da..aff28d9 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -863,6 +863,19 @@  SRST
     Display the vcpu dirty rate information.
 ERST
 
+    {
+        .name       = "vcpu_dirty_limit",
+        .args_type  = "cpu_index:l?",
+        .params     = "cpu_index",
+        .help       = "show dirty page limit information",
+        .cmd        = hmp_info_vcpu_dirty_limit,
+    },
+
+SRST
+  ``info vcpu_dirty_limit``
+    Display the vcpu dirty page limit information.
+ERST
+
 #if defined(TARGET_I386)
     {
         .name       = "sgx",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 70a9136..1839fa4 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1744,3 +1744,19 @@  ERST
                       "\n\t\t\t -b to specify dirty bitmap as method of calculation)",
         .cmd        = hmp_calc_dirty_rate,
     },
+
+SRST
+``vcpu_dirty_limit``
+  Start dirty page rate limit on a virtual CPU, the information about all the
+  virtual CPU dirty limit status can be observed with ``info vcpu_dirty_limit``
+  command.
+ERST
+
+    {
+        .name       = "vcpu_dirty_limit",
+        .args_type  = "cpu_index:l,enable:b,dirty_rate:l?",
+        .params     = "cpu_index on|off [dirty_rate]",
+        .help       = "enable, disable dirty page rate limit on a virtual CPU"
+                      "(dirty_rate should be specified dirty_rate if enable)",
+        .cmd        = hmp_vcpu_dirty_limit,
+    },
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index e948e81..11df012 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -881,6 +881,15 @@  void end_exclusive(void);
  */
 void qemu_init_vcpu(CPUState *cpu);
 
+/**
+ * dirtylimit_setup:
+ *
+ * Initializes the global state of dirtylimit calculation and
+ * dirtylimit itself. This is prepared for vCPU dirtylimit which
+ * could be triggered during vm lifecycle.
+ */
+void dirtylimit_setup(int max_cpus);
+
 #define SSTEP_ENABLE  0x1  /* Enable simulated HW single stepping */
 #define SSTEP_NOIRQ   0x2  /* Do not use IRQ while single stepping */
 #define SSTEP_NOTIMER 0x4  /* Do not Timers while single stepping */
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 96d0148..04879a2 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -131,6 +131,8 @@  void hmp_replay_delete_break(Monitor *mon, const QDict *qdict);
 void hmp_replay_seek(Monitor *mon, const QDict *qdict);
 void hmp_info_dirty_rate(Monitor *mon, const QDict *qdict);
 void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict);
+void hmp_info_vcpu_dirty_limit(Monitor *mon, const QDict *qdict);
+void hmp_vcpu_dirty_limit(Monitor *mon, const QDict *qdict);
 void hmp_human_readable_text_helper(Monitor *mon,
                                     HumanReadableText *(*qmp_handler)(Error **));
 
diff --git a/qapi/migration.json b/qapi/migration.json
index 3da8fdf..dc15b3f 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1872,6 +1872,54 @@ 
             'current-rate': 'int64' } }
 
 ##
+# @vcpu-dirty-limit:
+#
+# Set or cancel the upper limit of dirty page rate for a virtual CPU.
+#
+# Requires KVM with accelerator property "dirty-ring-size" set.
+# A virtual CPU's dirty page rate is a measure of its memory load.
+# To observe dirty page rates, use @calc-dirty-rate.
+#
+# @cpu-index: index of virtual CPU.
+#
+# @enable: true to enable, false to disable.
+#
+# @dirty-rate: upper limit of dirty page rate for virtual CPU.
+#
+# Since: 7.0
+#
+# Example:
+#   {"execute": "vcpu-dirty-limit"}
+#    "arguments": { "cpu-index": 0,
+#                   "enable": true,
+#                   "dirty-rate": 200 } }
+#
+##
+{ 'command': 'vcpu-dirty-limit',
+  'data': { 'cpu-index': 'int',
+            'enable': 'bool',
+            'dirty-rate': 'uint64'} }
+
+##
+# @query-vcpu-dirty-limit:
+#
+# Returns information about the virtual CPU dirty limit status.
+#
+# @cpu-index: index of the virtual CPU to query, if not specified, all
+#             virtual CPUs will be queried.
+#
+# Since: 7.0
+#
+# Example:
+#   {"execute": "query-vcpu-dirty-limit"}
+#    "arguments": { "cpu-index": 0 } }
+#
+##
+{ 'command': 'query-vcpu-dirty-limit',
+  'data': { '*cpu-index': 'int' },
+  'returns': [ 'DirtyLimitInfo' ] }
+
+##
 # @snapshot-save:
 #
 # Save a VM snapshot
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 620a1f1..0f83ce3 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -3777,5 +3777,6 @@  void qemu_init(int argc, char **argv, char **envp)
     qemu_init_displays();
     accel_setup_post(current_machine);
     os_setup_post();
+    dirtylimit_setup(current_machine->smp.max_cpus);
     resume_mux_open();
 }