[RFC,0/8] *** A Method for evaluating dirty page rate ***

Message ID 1595646669-109310-1-git-send-email-zhengchuan@huawei.com (mailing list archive)
Message

Zheng Chuan July 25, 2020, 3:11 a.m. UTC
From: Zheng Chuan <zhengchuan@huawei.com>

Sometimes it is necessary to evaluate the dirty page rate before migration.
Users can then decide whether to proceed with the migration based on the
evaluation, to avoid VM performance loss under a heavy workload.
Unlike simulating dirty log sync, which can harm a running VM,
we provide a sample-hash method that compares hash results for sampled pages.
In this way, it has hardly any impact on VM performance.
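
To illustrate the sample-hash idea, here is a minimal Python sketch. It is not the code from this series: it hashes a random sample of pages, hashes them again after the sampling period, and extrapolates the fraction of changed samples to the whole region. The helper names, the use of zlib.crc32 as the hash, and the 128-page sample size are assumptions made for the example.

```python
import random
import zlib

PAGE_SIZE = 4096  # the test VM in this cover letter uses 4K pages


def sample_hashes(memory, page_indexes):
    """Hash each sampled page of a bytes-like memory region.

    zlib.crc32 is only a stand-in hash chosen for this sketch.
    """
    return {
        i: zlib.crc32(memory[i * PAGE_SIZE:(i + 1) * PAGE_SIZE])
        for i in page_indexes
    }


def estimate_dirty_rate(before, after, total_pages, seconds):
    """Estimate bytes dirtied per second from two hash snapshots.

    The fraction of sampled pages whose hash changed is extrapolated
    to all pages in the region.
    """
    changed = sum(1 for i in before if before[i] != after[i])
    dirty_fraction = changed / len(before)
    return dirty_fraction * total_pages * PAGE_SIZE / seconds


# Toy demonstration on a fake 1024-page "guest memory" (hypothetical data).
total_pages = 1024
memory = bytearray(total_pages * PAGE_SIZE)
rng = random.Random(0)
sampled = rng.sample(range(total_pages), 128)

before = sample_hashes(memory, sampled)
# Pretend the guest wrote to every page in the first half within 1 second.
for page in range(total_pages // 2):
    memory[page * PAGE_SIZE] = 0xFF
after = sample_hashes(memory, sampled)

rate = estimate_dirty_rate(before, after, total_pages, seconds=1.0)
print(f"estimated dirty rate: {rate / 2**20:.2f} MiB/s")
```

Because only a sample of pages is read and hashed, the guest is never write-protected. In this sketch a page written several times within one period still counts once, so the estimate depends on the sampling period.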

We evaluate the dirty page rate on a running VM.
The VM specifications for migration are as follows:
- the VM uses 4K pages;
- the number of vCPUs is 32;
- the total memory is 32Gigabit;
- the 'mempress' tool is used to put memory pressure on the VM (mempress 4096 1024);

++++++++++++++++++++++++++++++++++++++++++
|                      |    dirtyrate    |
++++++++++++++++++++++++++++++++++++++++++
| no mempress          |     4MB/s       |
------------------------------------------
| mempress 4096 1024   |    1204MB/s     |
------------------------------------------
| mempress 4096 4096   |    4000MB/s     |
++++++++++++++++++++++++++++++++++++++++++

Test the dirty rate with QMP commands like this:
1.  virsh qemu-monitor-command [vmname] '{"execute":"cal_dirty_rate", "arguments": {"value": [sampletime]}}'
2.  virsh qemu-monitor-command [vmname] '{"execute":"get_dirty_rate"}'

The dirty rate can also be tested through the libvirt API like this:
virsh getdirtyrate [vmname] [sampletime]
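
As a rough illustration, the two QMP invocations above could be assembled from a script as follows. The command names and the "value" argument are taken from this cover letter; the helper names and the VM name "myvm" are hypothetical, and the unit of the sample time is not stated here, so the value 1 is only a placeholder.

```python
import json


def cal_dirty_rate_cmd(sample_time):
    """Build the JSON for the cal_dirty_rate command shown in the cover letter."""
    return json.dumps({"execute": "cal_dirty_rate",
                       "arguments": {"value": sample_time}})


def get_dirty_rate_cmd():
    """Build the JSON for the get_dirty_rate command shown in the cover letter."""
    return json.dumps({"execute": "get_dirty_rate"})


def virsh_argv(vmname, qmp_json):
    """Return an argv list for 'virsh qemu-monitor-command'.

    The list can be passed to subprocess.run() on a host where virsh
    is installed; this sketch only prints it.
    """
    return ["virsh", "qemu-monitor-command", vmname, qmp_json]


# "myvm" is a hypothetical domain name used only for illustration.
print(virsh_argv("myvm", cal_dirty_rate_cmd(1)))
print(virsh_argv("myvm", get_dirty_rate_cmd()))
```

Passing the command as a list rather than a shell string avoids having to quote the JSON by hand.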

Zheng Chuan (8):
  migration/dirtyrate: Add get_dirtyrate_thread() function
  migration/dirtyrate: Add block_dirty_info to store dirtypage info
  migration/dirtyrate: Add dirtyrate statistics series functions
  migration/dirtyrate: Record hash results for each ramblock
  migration/dirtyrate: Compare hash results for recorded ramblock
  migration/dirtyrate: Implement get_sample_gap_period() and
    block_sample_gap_period()
  migration/dirtyrate: Implement calculate_dirtyrate() function
  migration/dirtyrate: Implement
    qmp_cal_dirty_rate()/qmp_get_dirty_rate() function

 migration/Makefile.objs |   1 +
 migration/dirtyrate.c   | 424 ++++++++++++++++++++++++++++++++++++++++++++++++
 migration/dirtyrate.h   |  67 ++++++++
 qapi/migration.json     |  24 +++
 qapi/pragma.json        |   3 +-
 5 files changed, 518 insertions(+), 1 deletion(-)
 create mode 100644 migration/dirtyrate.c
 create mode 100644 migration/dirtyrate.h

Comments

Dr. David Alan Gilbert Aug. 4, 2020, 4:19 p.m. UTC | #1
* Chuan Zheng (zhengchuan@huawei.com) wrote:
> From: Zheng Chuan <zhengchuan@huawei.com>

Hi,

> ++++++++++++++++++++++++++++++++++++++++++
> |                      |    dirtyrate    |
> ++++++++++++++++++++++++++++++++++++++++++
> | no mempress          |     4MB/s       |
> ------------------------------------------
> | mempress 4096 1024   |    1204MB/s     |
> ------------------------------------------
> | mempress 4096 4096   |    4000MB/s     |
> ++++++++++++++++++++++++++++++++++++++++++

This is quite neat; I know we've got other people who have asked
for a similar feature!
Have you tried to validate these numbers against a real migration - e.g.
try setting mempress to dirty just under 1GByte/s and see if you can
migrate it over a 10Gbps link?

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Zheng Chuan Aug. 6, 2020, 7:36 a.m. UTC | #2
On 2020/8/5 0:19, Dr. David Alan Gilbert wrote:
> This is quite neat; I know we've got other people who have asked
> for a similar feature!
> Have you tried to validate these numbers against a real migration - e.g.
> try setting mempress to dirty just under 1GByte/s and see if you can
> migrate it over a 10Gbps link?
> 
> Dave
> 
Hi, Dave.
Thank you for your review.

Note that the original intention is to evaluate the dirty rate before migration.

However, I tested the dirty rate during a real migration over a 10Gbps link with various mempress settings; the results are shown below:
++++++++++++++++++++++++++++++++++++++++++
|                      |    dirtyrate    |
++++++++++++++++++++++++++++++++++++++++++
| no mempress          |     8MB/s       |
------------------------------------------
| mempress 4096 1024   |    1188MB/s     |
++++++++++++++++++++++++++++++++++++++++++

It still looks close to the actual dirty rate :)

Test results against a real migration will be posted in V2.

Dr. David Alan Gilbert Aug. 6, 2020, 4:58 p.m. UTC | #3
* Zheng Chuan (zhengchuan@huawei.com) wrote:
> Hi, Dave.
> Thank you for your review.
> 
> Note that the original intention is to evaluate the dirty rate before migration.

Right, but the reason you want to evaluate the dirty rate is, I guess,
to figure out whether a migration is likely to converge?

> However, I tested the dirty rate during a real migration over a 10Gbps link with various mempress settings; the results are shown below:
> ++++++++++++++++++++++++++++++++++++++++++
> |                      |    dirtyrate    |
> ++++++++++++++++++++++++++++++++++++++++++
> | no mempress          |     8MB/s       |
> ------------------------------------------
> | mempress 4096 1024   |    1188MB/s     |
> ++++++++++++++++++++++++++++++++++++++++++
> 
> It still looks close to the actual dirty rate :)

I don't quite understand the comparison you just gave.
But what I was expecting was that a mempress rate that
just fits within ~1100MB/s would be right at the limit of what you can
get down a 10Gbps link.

Dave
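
The bandwidth limit mentioned above can be worked out roughly as follows. This is a simplified sketch: the function names are hypothetical, the 0.88 efficiency factor is an assumption chosen only to reproduce the ~1100MB/s figure, and the convergence check ignores compression, zero pages, throttling and downtime limits.

```python
def link_capacity_mb_per_s(gbps, efficiency=1.0):
    """Convert a link speed in Gbit/s to MB/s (1 MB = 10**6 bytes).

    'efficiency' is the assumed fraction of the raw rate left for payload.
    """
    return gbps * 1000 / 8 * efficiency


def likely_to_converge(dirty_rate_mb_s, bandwidth_mb_s):
    """Pre-copy can only catch up if pages are sent faster than they are dirtied."""
    return dirty_rate_mb_s < bandwidth_mb_s


raw = link_capacity_mb_per_s(10)            # raw capacity of a 10Gbps link
usable = link_capacity_mb_per_s(10, 0.88)   # assumed usable payload rate

# Dirty rates reported earlier in this thread, in MB/s.
for label, rate in [("no mempress", 8), ("mempress 4096 1024", 1204)]:
    print(label, "converges" if likely_to_converge(rate, usable) else "may not converge")
```

Under these assumptions a workload dirtying about 1200MB/s sits just above what the link can carry, which is why a rate slightly below 1GByte/s is the interesting case for validation.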

Zheng Chuan Aug. 7, 2020, 6:13 a.m. UTC | #4
On 2020/8/7 0:58, Dr. David Alan Gilbert wrote:
>> Note that the original intention is to evaluate the dirty rate before migration.
> 
> Right, but the reason you want to evaluate the dirty rate is, I guess,
> to figure out whether a migration is likely to converge?
> 
Yes, in our practice, we use this feature to evaluate the dirty rate before migration.
If the dirty rate is too high, users could decide not to migrate, to avoid
migration failure and VM performance loss.
However, I think it could be extended for use at all stages of migration, including the migrating stage :)

>> However, I tested the dirty rate during a real migration over a 10Gbps link with various mempress settings; the results are shown below:
>> ++++++++++++++++++++++++++++++++++++++++++
>> |                      |    dirtyrate    |
>> ++++++++++++++++++++++++++++++++++++++++++
>> | no mempress          |     8MB/s       |
>> ------------------------------------------
>> | mempress 4096 1024   |    1188MB/s     |
>> ++++++++++++++++++++++++++++++++++++++++++
>>
>> It still looks close to the actual dirty rate :)
> 
> I don't quite understand the comparison you just gave.
> But what I was expecting was that a mempress rate that
> just fits within ~1100MB/s would be right at the limit of what you can
> get down a 10Gbps link.
> 
> Dave
> 
Well, what I mean is that, comparing the before-migration stage with the migrating stage over a 10Gbps link,
the dirty rates calculated by our method are very close.
We could use this method both before migration and while migrating.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|                   |                          |    dirtyrate       |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| before migration  |   mempress 4096 1024     |     1204MB/s       |
--------------------------------------------------------------------
| migrating         |   mempress 4096 1024     |     1188MB/s       |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

I am not sure whether this is the comparison you want.
If not, please let me know and I'll supplement it in V2 :)
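
To put a number on how close the two measurements in the table above are, the relative difference can be computed as below. The function name is hypothetical; the two rates are the ones reported in the table.

```python
def relative_difference(a, b):
    """Relative difference between two rates, as a fraction of the first."""
    return abs(a - b) / a


before_migration = 1204  # MB/s, measured before migration
migrating = 1188         # MB/s, measured while migrating

diff = relative_difference(before_migration, migrating)
print(f"{diff:.1%}")  # prints 1.3%
```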
