Message ID | 20230304014343.33646-14-joao.m.martins@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vfio/migration: Device dirty page tracking | expand |
On 3/4/23 02:43, Joao Martins wrote: > From: Avihai Horon <avihaih@nvidia.com> > > Adjust the VFIO dirty page tracking documentation and add a section to > describe device dirty page tracking. > > Signed-off-by: Avihai Horon <avihaih@nvidia.com> > Signed-off-by: Joao Martins <joao.m.martins@oracle.com> > --- > docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------ > 1 file changed, 31 insertions(+), 15 deletions(-) > > diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst > index c214c73e2818..1b68ccf11529 100644 > --- a/docs/devel/vfio-migration.rst > +++ b/docs/devel/vfio-migration.rst > @@ -59,22 +59,37 @@ System memory dirty pages tracking > ---------------------------------- > > A ``log_global_start`` and ``log_global_stop`` memory listener callback informs > -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync`` > -memory listener callback marks those system memory pages as dirty which are > -used for DMA by the VFIO device. The dirty pages bitmap is queried per > -container. All pages pinned by the vendor driver through external APIs have to > -be marked as dirty during migration. When there are CPU writes, CPU dirty page > -tracking can identify dirtied pages, but any page pinned by the vendor driver > -can also be written by the device. There is currently no device or IOMMU > -support for dirty page tracking in hardware. > +the VFIO dirty tracking module to start and stop dirty page tracking. A > +``log_sync`` memory listener callback queries the dirty page bitmap from the > +dirty tracking module and marks system memory pages which were DMA-ed by the > +VFIO device as dirty. The dirty page bitmap is queried per container. > + > +Currently there are two ways dirty page tracking can be done: > +(1) Device dirty tracking: > +In this method the device is responsible to log and report its DMAs. This > +method can be used only if the device is capable of tracking its DMAs. > +Discovering device capability, starting and stopping dirty tracking, and > +syncing the dirty bitmaps from the device are done using the DMA logging uAPI. > +More info about the uAPI can be found in the comments of the > +``vfio_device_feature_dma_logging_control`` and > +``vfio_device_feature_dma_logging_report`` structures in the header file > +linux-headers/linux/vfio.h. > + > +(2) VFIO IOMMU module: > +In this method dirty tracking is done by IOMMU. However, there is currently no > +IOMMU support for dirty page tracking. For this reason, all pages are > +perpetually marked dirty, unless the device driver pins pages through external > +APIs in which case only those pinned pages are perpetually marked dirty. > + > +If the above two methods are not supported, all pages are perpetually marked > +dirty by QEMU. > > By default, dirty pages are tracked during pre-copy as well as stop-and-copy > -phase. So, a page pinned by the vendor driver will be copied to the destination > -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if > -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps > -finding dirty pages continuously, then it understands that even in stop-and-copy > -phase, it is likely to find dirty pages and can predict the downtime > -accordingly. > +phase. So, a page marked as dirty will be copied to the destination in both > +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can > +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding > +dirty pages continuously, then it understands that even in stop-and-copy phase, > +it is likely to find dirty pages and can predict the downtime accordingly. > > QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking`` > which disables querying the dirty bitmap during pre-copy phase. If it is set to > @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns any dirty pages in > that range and QEMU reports corresponding guest physical pages dirty. During > stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped > pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those > -mapped ranges. > +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration > +will be blocked. There is a limitation with multiple devices also. Thanks, C. > > Flow of state changes during Live migration > ===========================================
On 06/03/2023 17:15, Cédric Le Goater wrote: > On 3/4/23 02:43, Joao Martins wrote: >> From: Avihai Horon <avihaih@nvidia.com> >> >> Adjust the VFIO dirty page tracking documentation and add a section to >> describe device dirty page tracking. >> >> Signed-off-by: Avihai Horon <avihaih@nvidia.com> >> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >> --- >> docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------ >> 1 file changed, 31 insertions(+), 15 deletions(-) >> >> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst >> index c214c73e2818..1b68ccf11529 100644 >> --- a/docs/devel/vfio-migration.rst >> +++ b/docs/devel/vfio-migration.rst >> @@ -59,22 +59,37 @@ System memory dirty pages tracking >> ---------------------------------- >> A ``log_global_start`` and ``log_global_stop`` memory listener callback >> informs >> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync`` >> -memory listener callback marks those system memory pages as dirty which are >> -used for DMA by the VFIO device. The dirty pages bitmap is queried per >> -container. All pages pinned by the vendor driver through external APIs have to >> -be marked as dirty during migration. When there are CPU writes, CPU dirty page >> -tracking can identify dirtied pages, but any page pinned by the vendor driver >> -can also be written by the device. There is currently no device or IOMMU >> -support for dirty page tracking in hardware. >> +the VFIO dirty tracking module to start and stop dirty page tracking. A >> +``log_sync`` memory listener callback queries the dirty page bitmap from the >> +dirty tracking module and marks system memory pages which were DMA-ed by the >> +VFIO device as dirty. The dirty page bitmap is queried per container. >> + >> +Currently there are two ways dirty page tracking can be done: >> +(1) Device dirty tracking: >> +In this method the device is responsible to log and report its DMAs. This >> +method can be used only if the device is capable of tracking its DMAs. >> +Discovering device capability, starting and stopping dirty tracking, and >> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI. >> +More info about the uAPI can be found in the comments of the >> +``vfio_device_feature_dma_logging_control`` and >> +``vfio_device_feature_dma_logging_report`` structures in the header file >> +linux-headers/linux/vfio.h. >> + >> +(2) VFIO IOMMU module: >> +In this method dirty tracking is done by IOMMU. However, there is currently no >> +IOMMU support for dirty page tracking. For this reason, all pages are >> +perpetually marked dirty, unless the device driver pins pages through external >> +APIs in which case only those pinned pages are perpetually marked dirty. >> + >> +If the above two methods are not supported, all pages are perpetually marked >> +dirty by QEMU. >> By default, dirty pages are tracked during pre-copy as well as stop-and-copy >> -phase. So, a page pinned by the vendor driver will be copied to the destination >> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if >> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps >> -finding dirty pages continuously, then it understands that even in stop-and-copy >> -phase, it is likely to find dirty pages and can predict the downtime >> -accordingly. >> +phase. So, a page marked as dirty will be copied to the destination in both >> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can >> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding >> +dirty pages continuously, then it understands that even in stop-and-copy phase, >> +it is likely to find dirty pages and can predict the downtime accordingly. >> QEMU also provides a per device opt-out option >> ``pre-copy-dirty-page-tracking`` >> which disables querying the dirty bitmap during pre-copy phase. If it is set to >> @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns >> any dirty pages in >> that range and QEMU reports corresponding guest physical pages dirty. During >> stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped >> pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those >> -mapped ranges. >> +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration >> +will be blocked. > > There is a limitation with multiple devices also. > I'm aware. I just didn't write it because the section I am changing is specific to vIOMMU. > Thanks, > > C. > >> Flow of state changes during Live migration >> =========================================== >
On 06/03/2023 17:18, Joao Martins wrote: > On 06/03/2023 17:15, Cédric Le Goater wrote: >> On 3/4/23 02:43, Joao Martins wrote: >>> From: Avihai Horon <avihaih@nvidia.com> >>> >>> Adjust the VFIO dirty page tracking documentation and add a section to >>> describe device dirty page tracking. >>> >>> Signed-off-by: Avihai Horon <avihaih@nvidia.com> >>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >>> --- >>> docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------ >>> 1 file changed, 31 insertions(+), 15 deletions(-) >>> >>> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst >>> index c214c73e2818..1b68ccf11529 100644 >>> --- a/docs/devel/vfio-migration.rst >>> +++ b/docs/devel/vfio-migration.rst >>> @@ -59,22 +59,37 @@ System memory dirty pages tracking >>> ---------------------------------- >>> A ``log_global_start`` and ``log_global_stop`` memory listener callback >>> informs >>> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync`` >>> -memory listener callback marks those system memory pages as dirty which are >>> -used for DMA by the VFIO device. The dirty pages bitmap is queried per >>> -container. All pages pinned by the vendor driver through external APIs have to >>> -be marked as dirty during migration. When there are CPU writes, CPU dirty page >>> -tracking can identify dirtied pages, but any page pinned by the vendor driver >>> -can also be written by the device. There is currently no device or IOMMU >>> -support for dirty page tracking in hardware. >>> +the VFIO dirty tracking module to start and stop dirty page tracking. A >>> +``log_sync`` memory listener callback queries the dirty page bitmap from the >>> +dirty tracking module and marks system memory pages which were DMA-ed by the >>> +VFIO device as dirty. The dirty page bitmap is queried per container. >>> + >>> +Currently there are two ways dirty page tracking can be done: >>> +(1) Device dirty tracking: >>> +In this method the device is responsible to log and report its DMAs. This >>> +method can be used only if the device is capable of tracking its DMAs. >>> +Discovering device capability, starting and stopping dirty tracking, and >>> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI. >>> +More info about the uAPI can be found in the comments of the >>> +``vfio_device_feature_dma_logging_control`` and >>> +``vfio_device_feature_dma_logging_report`` structures in the header file >>> +linux-headers/linux/vfio.h. >>> + >>> +(2) VFIO IOMMU module: >>> +In this method dirty tracking is done by IOMMU. However, there is currently no >>> +IOMMU support for dirty page tracking. For this reason, all pages are >>> +perpetually marked dirty, unless the device driver pins pages through external >>> +APIs in which case only those pinned pages are perpetually marked dirty. >>> + >>> +If the above two methods are not supported, all pages are perpetually marked >>> +dirty by QEMU. >>> By default, dirty pages are tracked during pre-copy as well as stop-and-copy >>> -phase. So, a page pinned by the vendor driver will be copied to the destination >>> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if >>> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps >>> -finding dirty pages continuously, then it understands that even in stop-and-copy >>> -phase, it is likely to find dirty pages and can predict the downtime >>> -accordingly. >>> +phase. So, a page marked as dirty will be copied to the destination in both >>> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can >>> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding >>> +dirty pages continuously, then it understands that even in stop-and-copy phase, >>> +it is likely to find dirty pages and can predict the downtime accordingly. >>> QEMU also provides a per device opt-out option >>> ``pre-copy-dirty-page-tracking`` >>> which disables querying the dirty bitmap during pre-copy phase. If it is set to >>> @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns >>> any dirty pages in >>> that range and QEMU reports corresponding guest physical pages dirty. During >>> stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped >>> pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those >>> -mapped ranges. >>> +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration >>> +will be blocked. >> >> There is a limitation with multiple devices also. >> > I'm aware. I just didn't write it because the section I am changing is specific > to vIOMMU. > ... and this patch is covering device dirty tracking
On 3/6/23 18:18, Joao Martins wrote: > > > On 06/03/2023 17:15, Cédric Le Goater wrote: >> On 3/4/23 02:43, Joao Martins wrote: >>> From: Avihai Horon <avihaih@nvidia.com> >>> >>> Adjust the VFIO dirty page tracking documentation and add a section to >>> describe device dirty page tracking. >>> >>> Signed-off-by: Avihai Horon <avihaih@nvidia.com> >>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >>> --- >>> docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------ >>> 1 file changed, 31 insertions(+), 15 deletions(-) >>> >>> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst >>> index c214c73e2818..1b68ccf11529 100644 >>> --- a/docs/devel/vfio-migration.rst >>> +++ b/docs/devel/vfio-migration.rst >>> @@ -59,22 +59,37 @@ System memory dirty pages tracking >>> ---------------------------------- >>> A ``log_global_start`` and ``log_global_stop`` memory listener callback >>> informs >>> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync`` >>> -memory listener callback marks those system memory pages as dirty which are >>> -used for DMA by the VFIO device. The dirty pages bitmap is queried per >>> -container. All pages pinned by the vendor driver through external APIs have to >>> -be marked as dirty during migration. When there are CPU writes, CPU dirty page >>> -tracking can identify dirtied pages, but any page pinned by the vendor driver >>> -can also be written by the device. There is currently no device or IOMMU >>> -support for dirty page tracking in hardware. >>> +the VFIO dirty tracking module to start and stop dirty page tracking. A >>> +``log_sync`` memory listener callback queries the dirty page bitmap from the >>> +dirty tracking module and marks system memory pages which were DMA-ed by the >>> +VFIO device as dirty. The dirty page bitmap is queried per container. >>> + >>> +Currently there are two ways dirty page tracking can be done: >>> +(1) Device dirty tracking: >>> +In this method the device is responsible to log and report its DMAs. This >>> +method can be used only if the device is capable of tracking its DMAs. >>> +Discovering device capability, starting and stopping dirty tracking, and >>> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI. >>> +More info about the uAPI can be found in the comments of the >>> +``vfio_device_feature_dma_logging_control`` and >>> +``vfio_device_feature_dma_logging_report`` structures in the header file >>> +linux-headers/linux/vfio.h. >>> + >>> +(2) VFIO IOMMU module: >>> +In this method dirty tracking is done by IOMMU. However, there is currently no >>> +IOMMU support for dirty page tracking. For this reason, all pages are >>> +perpetually marked dirty, unless the device driver pins pages through external >>> +APIs in which case only those pinned pages are perpetually marked dirty. >>> + >>> +If the above two methods are not supported, all pages are perpetually marked >>> +dirty by QEMU. >>> By default, dirty pages are tracked during pre-copy as well as stop-and-copy >>> -phase. So, a page pinned by the vendor driver will be copied to the destination >>> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if >>> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps >>> -finding dirty pages continuously, then it understands that even in stop-and-copy >>> -phase, it is likely to find dirty pages and can predict the downtime >>> -accordingly. >>> +phase. So, a page marked as dirty will be copied to the destination in both >>> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can >>> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding >>> +dirty pages continuously, then it understands that even in stop-and-copy phase, >>> +it is likely to find dirty pages and can predict the downtime accordingly. >>> QEMU also provides a per device opt-out option >>> ``pre-copy-dirty-page-tracking`` >>> which disables querying the dirty bitmap during pre-copy phase. If it is set to >>> @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns >>> any dirty pages in >>> that range and QEMU reports corresponding guest physical pages dirty. During >>> stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped >>> pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those >>> -mapped ranges. >>> +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration >>> +will be blocked. >> >> There is a limitation with multiple devices also. >> > I'm aware. I just didn't write it because the section I am changing is specific > to vIOMMU. Ah OK. I didn't check, sorry. Reviewed-by: Cédric Le Goater <clg@redhat.com> Thanks, C.
diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst index c214c73e2818..1b68ccf11529 100644 --- a/docs/devel/vfio-migration.rst +++ b/docs/devel/vfio-migration.rst @@ -59,22 +59,37 @@ System memory dirty pages tracking ---------------------------------- A ``log_global_start`` and ``log_global_stop`` memory listener callback informs -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync`` -memory listener callback marks those system memory pages as dirty which are -used for DMA by the VFIO device. The dirty pages bitmap is queried per -container. All pages pinned by the vendor driver through external APIs have to -be marked as dirty during migration. When there are CPU writes, CPU dirty page -tracking can identify dirtied pages, but any page pinned by the vendor driver -can also be written by the device. There is currently no device or IOMMU -support for dirty page tracking in hardware. +the VFIO dirty tracking module to start and stop dirty page tracking. A +``log_sync`` memory listener callback queries the dirty page bitmap from the +dirty tracking module and marks system memory pages which were DMA-ed by the +VFIO device as dirty. The dirty page bitmap is queried per container. + +Currently there are two ways dirty page tracking can be done: +(1) Device dirty tracking: +In this method the device is responsible to log and report its DMAs. This +method can be used only if the device is capable of tracking its DMAs. +Discovering device capability, starting and stopping dirty tracking, and +syncing the dirty bitmaps from the device are done using the DMA logging uAPI. +More info about the uAPI can be found in the comments of the +``vfio_device_feature_dma_logging_control`` and +``vfio_device_feature_dma_logging_report`` structures in the header file +linux-headers/linux/vfio.h. + +(2) VFIO IOMMU module: +In this method dirty tracking is done by IOMMU. However, there is currently no +IOMMU support for dirty page tracking. For this reason, all pages are +perpetually marked dirty, unless the device driver pins pages through external +APIs in which case only those pinned pages are perpetually marked dirty. + +If the above two methods are not supported, all pages are perpetually marked +dirty by QEMU. By default, dirty pages are tracked during pre-copy as well as stop-and-copy -phase. So, a page pinned by the vendor driver will be copied to the destination -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps -finding dirty pages continuously, then it understands that even in stop-and-copy -phase, it is likely to find dirty pages and can predict the downtime -accordingly. +phase. So, a page marked as dirty will be copied to the destination in both +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding +dirty pages continuously, then it understands that even in stop-and-copy phase, +it is likely to find dirty pages and can predict the downtime accordingly. QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking`` which disables querying the dirty bitmap during pre-copy phase. If it is set to @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns any dirty pages in that range and QEMU reports corresponding guest physical pages dirty. During stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those -mapped ranges. +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration +will be blocked. Flow of state changes during Live migration ===========================================