diff mbox series

[2/9] vfio/migration: Refactor vfio_devices_all_dirty_tracking() logic

Message ID 20241216094638.26406-3-avihaih@nvidia.com (mailing list archive)
State New
Headers show
Series migration: Drop/unexport migration_is_device() and migration_is_active() | expand

Commit Message

Avihai Horon Dec. 16, 2024, 9:46 a.m. UTC
During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
check if dirty tracking has been started in order to avoid errors. The
current logic checks if migration is in ACTIVE or DEVICE states to
ensure dirty tracking has been started.

However, recently there has been an effort to simplify the migration
status API and reduce it to a single migration_is_running() function.

To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
it won't use migration_is_active() and migration_is_device(). Instead,
use internal VFIO dirty tracking flags.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

Comments

Joao Martins Dec. 16, 2024, 12:32 p.m. UTC | #1
On 16/12/2024 09:46, Avihai Horon wrote:
> During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
> check if dirty tracking has been started in order to avoid errors. The
> current logic checks if migration is in ACTIVE or DEVICE states to
> ensure dirty tracking has been started.
> 
> However, recently there has been an effort to simplify the migration
> status API and reduce it to a single migration_is_running() function.
> 
> To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
> it won't use migration_is_active() and migration_is_device(). Instead,
> use internal VFIO dirty tracking flags.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>

The refactor itself is fine except a pre-existing bug:

	Reviewed-by: Joao Martins <joao.m.martins@oracle.com>

> ---
>  hw/vfio/common.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index dcef44fe55..a99796403e 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -170,11 +170,30 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
>             migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
>  }
>  
> +static bool vfio_devices_all_device_dirty_tracking_started(
> +    const VFIOContainerBase *bcontainer)
> +{
> +    VFIODevice *vbasedev;
> +
> +    QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
> +        if (!vbasedev->dirty_tracking) {
> +            return false;
> +        }
> +    }
> +
> +    return true;
> +}
> +
>  static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>  {
>      VFIODevice *vbasedev;
>  
> -    if (!migration_is_active() && !migration_is_device()) {
> +    if (!migration_is_running()) {
> +        return false;
> +    }
> +

Tieing to migration status means that non-KVM dirty trackers cannot be toggled
unless somebody starts migration. When really your original intention behind
commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
was to avoid the setup state when you are indeed during a migration.

Now you can actually start/sync/stop dirty trackers without migration when you
use calc-dirty-rate which is immensely useful to draw out how active a VM prior
to starting migration.

The fix is simple and would be to flex the condition to be something like:

	/* Migration status is 'none' with calc-dirty-rate */
	if (!migration_is_none() && !migration_is_running()) {
	    return false;
	}

This is ortoghonal to your series of course, but given you are skimming around
this area, sounded like a good idea to raise this. This patch below is what I
had plan to send when the development window started, but this was before folks
wanted to unexport migration status helpers. What would be the alternative idea
forward?

-------------------->8---------------------

From ace22f29a0547353e4ed5a0db53292a77f79fa81 Mon Sep 17 00:00:00 2001
From: Joao Martins <joao.m.martins@oracle.com>
Date: Wed, 9 Oct 2024 00:27:46 +0100
Subject: [PATCH] vfio/migration: Allow dirty tracking reports with
 MIGRATION_STATUS_NONE

Invoking calc-dirty-rate HMP/QMP method queries the VM dirty rate
without starting a live migration, which is useful e.g. to understand how
active guests are and even for testing purposes. calc-dirty-rate asks
the dirty rate from the VM and it's not restricted to a particular dirty
tracker.

However commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP
state")
didn't consider this and currently restricts that VF/IOMMU dirty info when migration
is active to allow it to be skipped during SETUP stage.

The vfio dirty tracker is already started, the reports are just skipped
based on migration status. So change vfio_devices_all_dirty_tracking() such
that we include MIGRATION_STATUS_NONE to cover calc-dirty-rate case.

Fixes: ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 hw/vfio/common.c         | 4 +++-
 include/migration/misc.h | 1 +
 migration/migration.c    | 7 +++++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index dcef44fe55be..0c188a2baac2 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -174,7 +174,9 @@ static bool
vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
 {
     VFIODevice *vbasedev;

-    if (!migration_is_active() && !migration_is_device()) {
+    /* Migration status is 'none' with calc-dirty-rate */
+    if (!migration_is_none() &&
+        !migration_is_active() && !migration_is_device()) {
         return false;
     }

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 804eb23c0607..857768b51383 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -53,6 +53,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
 void migration_object_init(void);
 void migration_shutdown(void);

+bool migration_is_none(void);
 bool migration_is_active(void);
 bool migration_is_device(void);
 bool migration_is_running(void);
diff --git a/migration/migration.c b/migration/migration.c
index 8c5bd0a75c85..49d11e1adf04 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1637,6 +1637,13 @@ bool migration_in_bg_snapshot(void)
     return migrate_background_snapshot() && migration_is_running();
 }

+bool migration_is_none(void)
+{
+    MigrationState *s = current_migration;
+
+    return s->state == MIGRATION_STATUS_NONE;
+}
+
 bool migration_is_active(void)
 {
     MigrationState *s = current_migration;
--
2.39.3
Joao Martins Dec. 16, 2024, 12:39 p.m. UTC | #2
On 16/12/2024 12:32, Joao Martins wrote:
> On 16/12/2024 09:46, Avihai Horon wrote:
>> During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
>> check if dirty tracking has been started in order to avoid errors. The
>> current logic checks if migration is in ACTIVE or DEVICE states to
>> ensure dirty tracking has been started.
>>
>> However, recently there has been an effort to simplify the migration
>> status API and reduce it to a single migration_is_running() function.
>>
>> To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
>> it won't use migration_is_active() and migration_is_device(). Instead,
>> use internal VFIO dirty tracking flags.
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> 
> The refactor itself is fine except a pre-existing bug:
> 
> 	Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> 
>> ---
>>  hw/vfio/common.c | 21 ++++++++++++++++++++-
>>  1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index dcef44fe55..a99796403e 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -170,11 +170,30 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
>>             migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
>>  }
>>  
>> +static bool vfio_devices_all_device_dirty_tracking_started(
>> +    const VFIOContainerBase *bcontainer)
>> +{
>> +    VFIODevice *vbasedev;
>> +
>> +    QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
>> +        if (!vbasedev->dirty_tracking) {
>> +            return false;
>> +        }
>> +    }
>> +
>> +    return true;
>> +}
>> +
>>  static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>>  {
>>      VFIODevice *vbasedev;
>>  
>> -    if (!migration_is_active() && !migration_is_device()) {
>> +    if (!migration_is_running()) {
>> +        return false;
>> +    }
>> +
> 
> Tieing to migration status means that non-KVM dirty trackers cannot be toggled
> unless somebody starts migration. When really your original intention behind
> commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
> was to avoid the setup state when you are indeed during a migration.
> 
> Now you can actually start/sync/stop dirty trackers without migration when you
> use calc-dirty-rate which is immensely useful to draw out how active a VM prior
> to starting migration.
> 
> The fix is simple and would be to flex the condition to be something like:
> 
> 	/* Migration status is 'none' with calc-dirty-rate */
> 	if (!migration_is_none() && !migration_is_running()) {
> 	    return false;
> 	}
> 
> This is ortoghonal to your series of course, but given you are skimming around
> this area, sounded like a good idea to raise this. This patch below is what I
> had plan to send when the development window started, but this was before folks
> wanted to unexport migration status helpers. What would be the alternative idea
> forward?
> 
> -------------------->8---------------------
> 
> From ace22f29a0547353e4ed5a0db53292a77f79fa81 Mon Sep 17 00:00:00 2001
> From: Joao Martins <joao.m.martins@oracle.com>
> Date: Wed, 9 Oct 2024 00:27:46 +0100
> Subject: [PATCH] vfio/migration: Allow dirty tracking reports with
>  MIGRATION_STATUS_NONE
> 
> Invoking calc-dirty-rate HMP/QMP method queries the VM dirty rate
> without starting a live migration, which is useful e.g. to understand how
> active guests are and even for testing purposes. calc-dirty-rate asks
> the dirty rate from the VM and it's not restricted to a particular dirty
> tracker.
> 
> However commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP
> state")
> didn't consider this and currently restricts that VF/IOMMU dirty info when migration
> is active to allow it to be skipped during SETUP stage.
> 
> The vfio dirty tracker is already started, the reports are just skipped
> based on migration status. So change vfio_devices_all_dirty_tracking() such
> that we include MIGRATION_STATUS_NONE to cover calc-dirty-rate case.
> 
> Fixes: ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> ---
>  hw/vfio/common.c         | 4 +++-
>  include/migration/misc.h | 1 +
>  migration/migration.c    | 7 +++++++
>  3 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index dcef44fe55be..0c188a2baac2 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -174,7 +174,9 @@ static bool
> vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>  {
>      VFIODevice *vbasedev;
> 
> -    if (!migration_is_active() && !migration_is_device()) {
> +    /* Migration status is 'none' with calc-dirty-rate */
> +    if (!migration_is_none() &&
> +        !migration_is_active() && !migration_is_device()) {
>          return false;
>      }
> 

NB: The patch is obviously incomplete given that I missed the check in
vfio_devices_all_running_and_mig_active()

> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 804eb23c0607..857768b51383 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -53,6 +53,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
>  void migration_object_init(void);
>  void migration_shutdown(void);
> 
> +bool migration_is_none(void);
>  bool migration_is_active(void);
>  bool migration_is_device(void);
>  bool migration_is_running(void);
> diff --git a/migration/migration.c b/migration/migration.c
> index 8c5bd0a75c85..49d11e1adf04 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1637,6 +1637,13 @@ bool migration_in_bg_snapshot(void)
>      return migrate_background_snapshot() && migration_is_running();
>  }
> 
> +bool migration_is_none(void)
> +{
> +    MigrationState *s = current_migration;
> +
> +    return s->state == MIGRATION_STATUS_NONE;
> +}
> +
>  bool migration_is_active(void)
>  {
>      MigrationState *s = current_migration;
> --
> 2.39.3
Avihai Horon Dec. 16, 2024, 2:52 p.m. UTC | #3
On 16/12/2024 14:32, Joao Martins wrote:
> External email: Use caution opening links or attachments
>
>
> On 16/12/2024 09:46, Avihai Horon wrote:
>> During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
>> check if dirty tracking has been started in order to avoid errors. The
>> current logic checks if migration is in ACTIVE or DEVICE states to
>> ensure dirty tracking has been started.
>>
>> However, recently there has been an effort to simplify the migration
>> status API and reduce it to a single migration_is_running() function.
>>
>> To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
>> it won't use migration_is_active() and migration_is_device(). Instead,
>> use internal VFIO dirty tracking flags.
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> The refactor itself is fine except a pre-existing bug:
>
>          Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
>
>> ---
>>   hw/vfio/common.c | 21 ++++++++++++++++++++-
>>   1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index dcef44fe55..a99796403e 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -170,11 +170,30 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
>>              migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
>>   }
>>
>> +static bool vfio_devices_all_device_dirty_tracking_started(
>> +    const VFIOContainerBase *bcontainer)
>> +{
>> +    VFIODevice *vbasedev;
>> +
>> +    QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
>> +        if (!vbasedev->dirty_tracking) {
>> +            return false;
>> +        }
>> +    }
>> +
>> +    return true;
>> +}
>> +
>>   static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>>   {
>>       VFIODevice *vbasedev;
>>
>> -    if (!migration_is_active() && !migration_is_device()) {
>> +    if (!migration_is_running()) {
>> +        return false;
>> +    }
>> +
> Tieing to migration status means that non-KVM dirty trackers cannot be toggled
> unless somebody starts migration. When really your original intention behind
> commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
> was to avoid the setup state when you are indeed during a migration.

It was tied to migration even prior to this commit, as VFIO log syncs 
were restricted to run only during migration (we had "if 
(!migration_is_setup_or_active())" check).
This commit only narrowed it down further to not run during SETUP.

>
> Now you can actually start/sync/stop dirty trackers without migration when you
> use calc-dirty-rate which is immensely useful to draw out how active a VM prior
> to starting migration.
>
> The fix is simple and would be to flex the condition to be something like:
>
>          /* Migration status is 'none' with calc-dirty-rate */
>          if (!migration_is_none() && !migration_is_running()) {
>              return false;
>          }
>
> This is ortoghonal to your series of course, but given you are skimming around
> this area, sounded like a good idea to raise this. This patch below is what I
> had plan to send when the development window started, but this was before folks
> wanted to unexport migration status helpers.

I remember you had several patches that formally added VFIO DPT to 
calc-dirty-rate (with a new "-d" QMP parameter and everything).
Are you still planning to send these?

> What would be the alternative idea
> forward?

Now we have an internal VFIO flag to indicate dirty tracking status, so 
that's one thing we can rely on.
And we can also use the global dirty tracking flags in 
include/exec/memory.h:

     /* Possible bits for global_dirty_log_{start|stop} */

     /* Dirty tracking enabled because migration is running */
     #define GLOBAL_DIRTY_MIGRATION  (1U << 0)

     /* Dirty tracking enabled because measuring dirty rate */
     #define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)

     /* Dirty tracking enabled because dirty limit */
     #define GLOBAL_DIRTY_LIMIT      (1U << 2)

     #define GLOBAL_DIRTY_MASK  (0x7)

     extern unsigned int global_dirty_tracking;

So I guess we can add some helpers to access global_dirty_tracking and 
use them in VFIO to decide when to allow log sync.

But as you wrote, I think that's orthogonal to this series.

Thanks.

>
> -------------------->8---------------------
>
>  From ace22f29a0547353e4ed5a0db53292a77f79fa81 Mon Sep 17 00:00:00 2001
> From: Joao Martins <joao.m.martins@oracle.com>
> Date: Wed, 9 Oct 2024 00:27:46 +0100
> Subject: [PATCH] vfio/migration: Allow dirty tracking reports with
>   MIGRATION_STATUS_NONE
>
> Invoking calc-dirty-rate HMP/QMP method queries the VM dirty rate
> without starting a live migration, which is useful e.g. to understand how
> active guests are and even for testing purposes. calc-dirty-rate asks
> the dirty rate from the VM and it's not restricted to a particular dirty
> tracker.
>
> However commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP
> state")
> didn't consider this and currently restricts that VF/IOMMU dirty info when migration
> is active to allow it to be skipped during SETUP stage.
>
> The vfio dirty tracker is already started, the reports are just skipped
> based on migration status. So change vfio_devices_all_dirty_tracking() such
> that we include MIGRATION_STATUS_NONE to cover calc-dirty-rate case.
>
> Fixes: ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> ---
>   hw/vfio/common.c         | 4 +++-
>   include/migration/misc.h | 1 +
>   migration/migration.c    | 7 +++++++
>   3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index dcef44fe55be..0c188a2baac2 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -174,7 +174,9 @@ static bool
> vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>   {
>       VFIODevice *vbasedev;
>
> -    if (!migration_is_active() && !migration_is_device()) {
> +    /* Migration status is 'none' with calc-dirty-rate */
> +    if (!migration_is_none() &&
> +        !migration_is_active() && !migration_is_device()) {
>           return false;
>       }
>
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 804eb23c0607..857768b51383 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -53,6 +53,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
>   void migration_object_init(void);
>   void migration_shutdown(void);
>
> +bool migration_is_none(void);
>   bool migration_is_active(void);
>   bool migration_is_device(void);
>   bool migration_is_running(void);
> diff --git a/migration/migration.c b/migration/migration.c
> index 8c5bd0a75c85..49d11e1adf04 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1637,6 +1637,13 @@ bool migration_in_bg_snapshot(void)
>       return migrate_background_snapshot() && migration_is_running();
>   }
>
> +bool migration_is_none(void)
> +{
> +    MigrationState *s = current_migration;
> +
> +    return s->state == MIGRATION_STATUS_NONE;
> +}
> +
>   bool migration_is_active(void)
>   {
>       MigrationState *s = current_migration;
> --
> 2.39.3
Joao Martins Dec. 16, 2024, 3:52 p.m. UTC | #4
On 16/12/2024 14:52, Avihai Horon wrote:
> 
> On 16/12/2024 14:32, Joao Martins wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 16/12/2024 09:46, Avihai Horon wrote:
>>> During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
>>> check if dirty tracking has been started in order to avoid errors. The
>>> current logic checks if migration is in ACTIVE or DEVICE states to
>>> ensure dirty tracking has been started.
>>>
>>> However, recently there has been an effort to simplify the migration
>>> status API and reduce it to a single migration_is_running() function.
>>>
>>> To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
>>> it won't use migration_is_active() and migration_is_device(). Instead,
>>> use internal VFIO dirty tracking flags.
>>>
>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> The refactor itself is fine except a pre-existing bug:
>>
>>          Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
>>
>>> ---
>>>   hw/vfio/common.c | 21 ++++++++++++++++++++-
>>>   1 file changed, 20 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index dcef44fe55..a99796403e 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -170,11 +170,30 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
>>>              migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
>>>   }
>>>
>>> +static bool vfio_devices_all_device_dirty_tracking_started(
>>> +    const VFIOContainerBase *bcontainer)
>>> +{
>>> +    VFIODevice *vbasedev;
>>> +
>>> +    QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
>>> +        if (!vbasedev->dirty_tracking) {
>>> +            return false;
>>> +        }
>>> +    }
>>> +
>>> +    return true;
>>> +}
>>> +
>>>   static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>>>   {
>>>       VFIODevice *vbasedev;
>>>
>>> -    if (!migration_is_active() && !migration_is_device()) {
>>> +    if (!migration_is_running()) {
>>> +        return false;
>>> +    }
>>> +
>> Tieing to migration status means that non-KVM dirty trackers cannot be toggled
>> unless somebody starts migration. When really your original intention behind
>> commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
>> was to avoid the setup state when you are indeed during a migration.
> 
> It was tied to migration even prior to this commit, as VFIO log syncs were
> restricted to run only during migration (we had "if (!
> migration_is_setup_or_active())" check).
> This commit only narrowed it down further to not run during SETUP.
> 

Ok, good point.

Btw you are regressing from that behaviour with this change above, because if
migration has state MIGRATION_STATUS_SETUP and migration_is_running() will
return true and so you will log dirty pages.

>>
>> Now you can actually start/sync/stop dirty trackers without migration when you
>> use calc-dirty-rate which is immensely useful to draw out how active a VM prior
>> to starting migration.
>>
>> The fix is simple and would be to flex the condition to be something like:
>>
>>          /* Migration status is 'none' with calc-dirty-rate */
>>          if (!migration_is_none() && !migration_is_running()) {
>>              return false;
>>          }
>>
>> This is ortoghonal to your series of course, but given you are skimming around
>> this area, sounded like a good idea to raise this. This patch below is what I
>> had plan to send when the development window started, but this was before folks
>> wanted to unexport migration status helpers.
> 
> I remember you had several patches that formally added VFIO DPT to calc-dirty-
> rate (with a new "-d" QMP parameter and everything).
> Are you still planning to send these?
> 

calc-dirty-rate *implicitly* just logs KVM pages, but in theory it should be
ortoghonal to any dirty tracker that is able to log pages. So in that line of
thought it should be logging pages from all dirty trackers in use.

To actually include all dirty tracking data and while fix the performance issue
to not account for setup migration state then the check is a simple condition
for SETUP state *if* where's a migration started *or* no migration started at all.

I don't think the VF dirty trackers steer the data based on migration -- VFIO
seems to be the only one IIUC. Hence fixing calc-dirty-rate seemed more accurate
in my point of view, and optionally we could restrict scope of dirty tracking as
a bonus

>> What would be the alternative idea
>> forward?
> 
> Now we have an internal VFIO flag to indicate dirty tracking status, so that's
> one thing we can rely on.
> And we can also use the global dirty tracking flags in include/exec/memory.h:
> 
>     /* Possible bits for global_dirty_log_{start|stop} */
> 
>     /* Dirty tracking enabled because migration is running */
>     #define GLOBAL_DIRTY_MIGRATION  (1U << 0)
> 
>     /* Dirty tracking enabled because measuring dirty rate */
>     #define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
> 
>     /* Dirty tracking enabled because dirty limit */
>     #define GLOBAL_DIRTY_LIMIT      (1U << 2)
> 
>     #define GLOBAL_DIRTY_MASK  (0x7)
> 
>     extern unsigned int global_dirty_tracking;
> 
> So I guess we can add some helpers to access global_dirty_tracking and use them
> in VFIO to decide when to allow log sync.
> 
> But as you wrote, I think that's orthogonal to this series.
> 
Right I also had sketched it like this to reduce the scope of dirty tracking.

This problem preceeds your setup fix so don't wanna go offtopic. Anyway just
wanted to understanding how migration status is going to be exported to see
what's the way forward for that.

> Thanks.
> 
>>
>> -------------------->8---------------------
>>
>>  From ace22f29a0547353e4ed5a0db53292a77f79fa81 Mon Sep 17 00:00:00 2001
>> From: Joao Martins <joao.m.martins@oracle.com>
>> Date: Wed, 9 Oct 2024 00:27:46 +0100
>> Subject: [PATCH] vfio/migration: Allow dirty tracking reports with
>>   MIGRATION_STATUS_NONE
>>
>> Invoking calc-dirty-rate HMP/QMP method queries the VM dirty rate
>> without starting a live migration, which is useful e.g. to understand how
>> active guests are and even for testing purposes. calc-dirty-rate asks
>> the dirty rate from the VM and it's not restricted to a particular dirty
>> tracker.
>>
>> However commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP
>> state")
>> didn't consider this and currently restricts that VF/IOMMU dirty info when
>> migration
>> is active to allow it to be skipped during SETUP stage.
>>
>> The vfio dirty tracker is already started, the reports are just skipped
>> based on migration status. So change vfio_devices_all_dirty_tracking() such
>> that we include MIGRATION_STATUS_NONE to cover calc-dirty-rate case.
>>
>> Fixes: ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> ---
>>   hw/vfio/common.c         | 4 +++-
>>   include/migration/misc.h | 1 +
>>   migration/migration.c    | 7 +++++++
>>   3 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index dcef44fe55be..0c188a2baac2 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -174,7 +174,9 @@ static bool
>> vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>>   {
>>       VFIODevice *vbasedev;
>>
>> -    if (!migration_is_active() && !migration_is_device()) {
>> +    /* Migration status is 'none' with calc-dirty-rate */
>> +    if (!migration_is_none() &&
>> +        !migration_is_active() && !migration_is_device()) {
>>           return false;
>>       }
>>
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index 804eb23c0607..857768b51383 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -53,6 +53,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
>>   void migration_object_init(void);
>>   void migration_shutdown(void);
>>
>> +bool migration_is_none(void);
>>   bool migration_is_active(void);
>>   bool migration_is_device(void);
>>   bool migration_is_running(void);
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 8c5bd0a75c85..49d11e1adf04 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1637,6 +1637,13 @@ bool migration_in_bg_snapshot(void)
>>       return migrate_background_snapshot() && migration_is_running();
>>   }
>>
>> +bool migration_is_none(void)
>> +{
>> +    MigrationState *s = current_migration;
>> +
>> +    return s->state == MIGRATION_STATUS_NONE;
>> +}
>> +
>>   bool migration_is_active(void)
>>   {
>>       MigrationState *s = current_migration;
>> -- 
>> 2.39.3
Joao Martins Dec. 16, 2024, 4:05 p.m. UTC | #5
On 16/12/2024 15:52, Joao Martins wrote:
> On 16/12/2024 14:52, Avihai Horon wrote:
>>
>> On 16/12/2024 14:32, Joao Martins wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On 16/12/2024 09:46, Avihai Horon wrote:
>>>> During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
>>>> check if dirty tracking has been started in order to avoid errors. The
>>>> current logic checks if migration is in ACTIVE or DEVICE states to
>>>> ensure dirty tracking has been started.
>>>>
>>>> However, recently there has been an effort to simplify the migration
>>>> status API and reduce it to a single migration_is_running() function.
>>>>
>>>> To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
>>>> it won't use migration_is_active() and migration_is_device(). Instead,
>>>> use internal VFIO dirty tracking flags.
>>>>
>>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>>> The refactor itself is fine except a pre-existing bug:
>>>
>>>          Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
>>>
>>>> ---
>>>>   hw/vfio/common.c | 21 ++++++++++++++++++++-
>>>>   1 file changed, 20 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>>> index dcef44fe55..a99796403e 100644
>>>> --- a/hw/vfio/common.c
>>>> +++ b/hw/vfio/common.c
>>>> @@ -170,11 +170,30 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
>>>>              migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
>>>>   }
>>>>
>>>> +static bool vfio_devices_all_device_dirty_tracking_started(
>>>> +    const VFIOContainerBase *bcontainer)
>>>> +{
>>>> +    VFIODevice *vbasedev;
>>>> +
>>>> +    QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
>>>> +        if (!vbasedev->dirty_tracking) {
>>>> +            return false;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return true;
>>>> +}
>>>> +
>>>>   static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>>>>   {
>>>>       VFIODevice *vbasedev;
>>>>
>>>> -    if (!migration_is_active() && !migration_is_device()) {
>>>> +    if (!migration_is_running()) {
>>>> +        return false;
>>>> +    }
>>>> +
>>> Tieing to migration status means that non-KVM dirty trackers cannot be toggled
>>> unless somebody starts migration. When really your original intention behind
>>> commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
>>> was to avoid the setup state when you are indeed during a migration.
>>
>> It was tied to migration even prior to this commit, as VFIO log syncs were
>> restricted to run only during migration (we had "if (!
>> migration_is_setup_or_active())" check).
>> This commit only narrowed it down further to not run during SETUP.
>>
> 
> Ok, good point.
> 
> Btw you are regressing from that behaviour with this change above, because if
> migration has state MIGRATION_STATUS_SETUP and migration_is_running() will
> return true and so you will log dirty pages.
> 

Nevermind this comment.

Just noticed that it was the point of the whole thread:

https://lore.kernel.org/qemu-devel/78729b4b-3747-4408-8146-12d49e70fed1@nvidia.com/#r

... where you discuss this.
Joao Martins Dec. 16, 2024, 7:53 p.m. UTC | #6
On 16/12/2024 16:05, Joao Martins wrote:
> On 16/12/2024 15:52, Joao Martins wrote:
>> On 16/12/2024 14:52, Avihai Horon wrote:
>>>
>>> On 16/12/2024 14:32, Joao Martins wrote:
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> On 16/12/2024 09:46, Avihai Horon wrote:
>>>>> During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
>>>>> check if dirty tracking has been started in order to avoid errors. The
>>>>> current logic checks if migration is in ACTIVE or DEVICE states to
>>>>> ensure dirty tracking has been started.
>>>>>
>>>>> However, recently there has been an effort to simplify the migration
>>>>> status API and reduce it to a single migration_is_running() function.
>>>>>
>>>>> To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
>>>>> it won't use migration_is_active() and migration_is_device(). Instead,
>>>>> use internal VFIO dirty tracking flags.
>>>>>
>>>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>>>> The refactor itself is fine except a pre-existing bug:
>>>>
>>>>          Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
>>>>
>>>>> ---
>>>>>   hw/vfio/common.c | 21 ++++++++++++++++++++-
>>>>>   1 file changed, 20 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>>>> index dcef44fe55..a99796403e 100644
>>>>> --- a/hw/vfio/common.c
>>>>> +++ b/hw/vfio/common.c
>>>>> @@ -170,11 +170,30 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
>>>>>              migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
>>>>>   }
>>>>>
>>>>> +static bool vfio_devices_all_device_dirty_tracking_started(
>>>>> +    const VFIOContainerBase *bcontainer)
>>>>> +{
>>>>> +    VFIODevice *vbasedev;
>>>>> +
>>>>> +    QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
>>>>> +        if (!vbasedev->dirty_tracking) {
>>>>> +            return false;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return true;
>>>>> +}
>>>>> +
>>>>>   static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>>>>>   {
>>>>>       VFIODevice *vbasedev;
>>>>>
>>>>> -    if (!migration_is_active() && !migration_is_device()) {
>>>>> +    if (!migration_is_running()) {
>>>>> +        return false;
>>>>> +    }
>>>>> +
>>>> Tieing to migration status means that non-KVM dirty trackers cannot be toggled
>>>> unless somebody starts migration. When really your original intention behind
>>>> commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
>>>> was to avoid the setup state when you are indeed during a migration.
>>>
>>> It was tied to migration even prior to this commit, as VFIO log syncs were
>>> restricted to run only during migration (we had "if (!
>>> migration_is_setup_or_active())" check).
>>> This commit only narrowed it down further to not run during SETUP.
>>>
>>
>> Ok, good point.
>>
>> Btw you are regressing from that behaviour with this change above, because if
>> migration has state MIGRATION_STATUS_SETUP and migration_is_running() will
>> return true and so you will log dirty pages.
>>
> 
> Nevermind this comment.
> 
> Just noticed that it was the point of the whole thread:
> 
> https://lore.kernel.org/qemu-devel/78729b4b-3747-4408-8146-12d49e70fed1@nvidia.com/#r
> 
> ... where you discuss this.
> 

Actually, from the thread quoted in the cover letter[0], at the end of this
series we rely on status of dirty tracking to go ahead with the syncs instead of
relying on migration running or not.
Hence just removing migration_is_running() calls wouldn't regress the bug you
fixed with commit ff180c6bd7, neither calc-dirty-rate would miss dirty data from
its reports.

[0]
https://lore.kernel.org/qemu-devel/58146556-d3fa-4d8b-a1db-9bdc68168c78@nvidia.com/
Avihai Horon Dec. 17, 2024, 9:31 a.m. UTC | #7
On 16/12/2024 21:53, Joao Martins wrote:
> External email: Use caution opening links or attachments
>
>
> On 16/12/2024 16:05, Joao Martins wrote:
>> On 16/12/2024 15:52, Joao Martins wrote:
>>> On 16/12/2024 14:52, Avihai Horon wrote:
>>>> On 16/12/2024 14:32, Joao Martins wrote:
>>>>> External email: Use caution opening links or attachments
>>>>>
>>>>>
>>>>> On 16/12/2024 09:46, Avihai Horon wrote:
>>>>>> During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
>>>>>> check if dirty tracking has been started in order to avoid errors. The
>>>>>> current logic checks if migration is in ACTIVE or DEVICE states to
>>>>>> ensure dirty tracking has been started.
>>>>>>
>>>>>> However, recently there has been an effort to simplify the migration
>>>>>> status API and reduce it to a single migration_is_running() function.
>>>>>>
>>>>>> To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
>>>>>> it won't use migration_is_active() and migration_is_device(). Instead,
>>>>>> use internal VFIO dirty tracking flags.
>>>>>>
>>>>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>>>>> The refactor itself is fine except a pre-existing bug:
>>>>>
>>>>>           Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
>>>>>
>>>>>> ---
>>>>>>    hw/vfio/common.c | 21 ++++++++++++++++++++-
>>>>>>    1 file changed, 20 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>>>>> index dcef44fe55..a99796403e 100644
>>>>>> --- a/hw/vfio/common.c
>>>>>> +++ b/hw/vfio/common.c
>>>>>> @@ -170,11 +170,30 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
>>>>>>               migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
>>>>>>    }
>>>>>>
>>>>>> +static bool vfio_devices_all_device_dirty_tracking_started(
>>>>>> +    const VFIOContainerBase *bcontainer)
>>>>>> +{
>>>>>> +    VFIODevice *vbasedev;
>>>>>> +
>>>>>> +    QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
>>>>>> +        if (!vbasedev->dirty_tracking) {
>>>>>> +            return false;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    return true;
>>>>>> +}
>>>>>> +
>>>>>>    static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
>>>>>>    {
>>>>>>        VFIODevice *vbasedev;
>>>>>>
>>>>>> -    if (!migration_is_active() && !migration_is_device()) {
>>>>>> +    if (!migration_is_running()) {
>>>>>> +        return false;
>>>>>> +    }
>>>>>> +
>>>>> Tieing to migration status means that non-KVM dirty trackers cannot be toggled
>>>>> unless somebody starts migration. When really your original intention behind
>>>>> commit ff180c6bd7 ("vfio/migration: Skip log_sync during migration SETUP state")
>>>>> was to avoid the setup state when you are indeed during a migration.
>>>> It was tied to migration even prior to this commit, as VFIO log syncs were
>>>> restricted to run only during migration (we had "if (!
>>>> migration_is_setup_or_active())" check).
>>>> This commit only narrowed it down further to not run during SETUP.
>>>>
>>> Ok, good point.
>>>
>>> Btw you are regressing from that behaviour with this change above, because if
>>> migration has state MIGRATION_STATUS_SETUP and migration_is_running() will
>>> return true and so you will log dirty pages.
>>>
>> Nevermind this comment.
>>
>> Just noticed that it was the point of the whole thread:
>>
>> https://lore.kernel.org/qemu-devel/78729b4b-3747-4408-8146-12d49e70fed1@nvidia.com/#r
>>
>> ... where you discuss this.
>>
> Actually, from the thread quoted in the cover letter[0], at the end of this
> series we rely on status of dirty tracking to go ahead with the syncs instead of
> relying on migration running or not.
> Hence just removing migration_is_running() calls wouldn't regress the bug you
> fixed with commit ff180c6bd7, neither calc-dirty-rate would miss dirty data from
> its reports.

Right.
But since it was previously tied to migration and because I remembered 
you had the calc-dirty-rate series that also added the -d param to 
control the scope of dirty rate, I thought I should only stick to 
refactoring for the migration status API simplification and that the 
calc-dirty-rate series would come after.

But if we simply want to include VFIO DPT in calc-dirty-rate, then yes, 
I can untie it from migration in v2.

> [0]
> https://lore.kernel.org/qemu-devel/58146556-d3fa-4d8b-a1db-9bdc68168c78@nvidia.com/
diff mbox series

Patch

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index dcef44fe55..a99796403e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -170,11 +170,30 @@  bool vfio_device_state_is_precopy(VFIODevice *vbasedev)
            migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
 }
 
+static bool vfio_devices_all_device_dirty_tracking_started(
+    const VFIOContainerBase *bcontainer)
+{
+    VFIODevice *vbasedev;
+
+    QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
+        if (!vbasedev->dirty_tracking) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
 static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
 {
     VFIODevice *vbasedev;
 
-    if (!migration_is_active() && !migration_is_device()) {
+    if (!migration_is_running()) {
+        return false;
+    }
+
+    if (!(vfio_devices_all_device_dirty_tracking_started(bcontainer) ||
+          bcontainer->dirty_pages_started)) {
         return false;
     }