[RFC,0/3] Add migration support for VFIO PCI devices in SMMUv3 nested stage mode

Message ID 20210219094230.231-1-jiangkunkun@huawei.com (mailing list archive)

Message

Kunkun Jiang Feb. 19, 2021, 9:42 a.m. UTC
Hi all,

Since Eric introduced the SMMUv3 nested translation stages[1], we need to pay
attention to the migration of VFIO PCI devices in SMMUv3 nested stage mode. At
present, this is not yet supported in QEMU. There are two problems in the
existing framework.

First, the current way of getting dirty pages does not work in nested stage mode.
On VT-d, thanks to "Caching Mode", the RAM can be mapped through a single host
stage (giova->hpa), and "vfio_listener_log_sync" retrieves dirty pages by passing
"giova" ranges to the kernel for each mapped RAM section. In nested stage mode,
however, stage 2 (gpa->hpa) and stage 1 (giova->gpa) are set up separately, so
dirty pages cannot be obtained the current way.

Second, we also need to pass the stage 1 configurations to the destination host
after migration. In Eric's series, the stage 1 configuration is passed to the
host on each STE update, for devices that set the PASID PciOps. The configuration
is applied at the physical SMMU level, but that physical-level state is not sent
to the destination host. So we have to pass the stage 1 configurations to the
destination host after migration.

This patch set includes the following patches:
Patch 1-2:
- Refactor vfio_listener_log_sync and add a new function to get dirty pages
in nested stage mode.

Patch 3:
- Add a post_load function to vmstate_smmuv3 to pass the stage 1 configuration
to the destination host after migration.

@Eric, could you please add this patch set to a future version of
"vSMMUv3/pSMMUv3 2 stage VFIO integration", if you think it makes sense? :)

Best Regards
Kunkun Jiang

[1] [RFC,v7,00/26] vSMMUv3/pSMMUv3 2 stage VFIO integration
http://patchwork.ozlabs.org/project/qemu-devel/cover/20201116181349.11908-1-eric.auger@redhat.com/

Kunkun Jiang (3):
  vfio: Introduce helpers to mark dirty pages of a RAM section
  vfio: Add vfio_prereg_listener_log_sync in nested stage
  hw/arm/smmuv3: Post-load stage 1 configurations to the host

 hw/arm/smmuv3.c     | 60 +++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events |  1 +
 hw/vfio/common.c    | 47 +++++++++++++++++++++++++++++------
 3 files changed, 100 insertions(+), 8 deletions(-)

Comments

Kunkun Jiang March 1, 2021, 8:27 a.m. UTC | #1
kindly ping,
Any comments and reviews are welcome.

Thanks.
Kunkun Jiang.

On 2021/2/19 17:42, Kunkun Jiang wrote:
> [...]
Eric Auger April 12, 2021, 8:40 a.m. UTC | #2
Hi Kunkun,

On 2/19/21 10:42 AM, Kunkun Jiang wrote:
> [...]
> @Eric, Could you please add this Patch set to your future version of
> "vSMMUv3/pSMMUv3 2 stage VFIO integration", if you think this Patch set makes sense? :)
First of all, thank you for working on this. As you may have noticed, I
sent a new RFC version yesterday (without including this). When time
allows, you may have a look at the comments I posted on your series. I
don't think I can test it at the moment, so I would prefer to keep it
separate. Also be aware that the QEMU integration of nested has not
received many comments yet and is likely to evolve. The priority is to
get some R-b's on the kernel pieces, especially the SMMU part. Until this
dependency is resolved, things can't move forward, I am afraid.

Thanks

Eric
Kunkun Jiang April 12, 2021, 1:46 p.m. UTC | #3
Hi Eric,

On 2021/4/12 16:40, Auger Eric wrote:
> Hi Kunkun,
>
> On 2/19/21 10:42 AM, Kunkun Jiang wrote:
>> [...]
>> @Eric, Could you please add this Patch set to your future version of
>> "vSMMUv3/pSMMUv3 2 stage VFIO integration", if you think this Patch set makes sense? :)
> First of all, thank you for working on this. As you may have noticed I
> sent a new RFC version yesterday (without including this). When time
> allows, you may have a look at the comments I posted on your series. I
> don't think I can test it at the moment so I may prefer to keep it
> separate. Also be aware that the QEMU integration of nested has not
> received much comments yet and is likely to evolve. The priority is to
> get some R-b's on the kernel pieces, especially the SMMU part. With this
> dependency resolved, things can't move forward I am afraid.
>
> Thanks
>
> Eric
Yes, I saw your latest version and comments. Thanks again for your review.

I will try my best to test your patches and come up with some useful ideas.
In the future, this series will be updated along with your nested series.
If conditions permit, I hope you can test it and give me some comments.

Best regards,
Kunkun Jiang