Message ID | 20230222174915.5647-21-avihaih@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vfio: Add migration pre-copy support and device dirty tracking | expand |
On 2/22/23 18:49, Avihai Horon wrote: > Adjust the VFIO dirty page tracking documentation and add a section to > describe device dirty page tracking. > > Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Thanks, C. > --- > docs/devel/vfio-migration.rst | 50 ++++++++++++++++++++++------------- > 1 file changed, 32 insertions(+), 18 deletions(-) > > diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst > index ba80b9150d..a432cda081 100644 > --- a/docs/devel/vfio-migration.rst > +++ b/docs/devel/vfio-migration.rst > @@ -71,22 +71,37 @@ System memory dirty pages tracking > ---------------------------------- > > A ``log_global_start`` and ``log_global_stop`` memory listener callback informs > -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync`` > -memory listener callback marks those system memory pages as dirty which are > -used for DMA by the VFIO device. The dirty pages bitmap is queried per > -container. All pages pinned by the vendor driver through external APIs have to > -be marked as dirty during migration. When there are CPU writes, CPU dirty page > -tracking can identify dirtied pages, but any page pinned by the vendor driver > -can also be written by the device. There is currently no device or IOMMU > -support for dirty page tracking in hardware. > +the VFIO dirty tracking module to start and stop dirty page tracking. A > +``log_sync`` memory listener callback queries the dirty page bitmap from the > +dirty tracking module and marks system memory pages which were DMA-ed by the > +VFIO device as dirty. The dirty page bitmap is queried per container. > + > +Currently there are two ways dirty page tracking can be done: > +(1) Device dirty tracking: > +In this method the device is responsible to log and report its DMAs. This > +method can be used only if the device is capable of tracking its DMAs. > +Discovering device capability, starting and stopping dirty tracking, and > +syncing the dirty bitmaps from the device are done using the DMA logging uAPI. > +More info about the uAPI can be found in the comments of the > +``vfio_device_feature_dma_logging_control`` and > +``vfio_device_feature_dma_logging_report`` structures in the header file > +linux-headers/linux/vfio.h. > + > +(2) VFIO IOMMU module: > +In this method dirty tracking is done by IOMMU. However, there is currently no > +IOMMU support for dirty page tracking. For this reason, all pages are > +perpetually marked dirty, unless the device driver pins pages through external > +APIs in which case only those pinned pages are perpetually marked dirty. > + > +If the above two methods are not supported, all pages are perpetually marked > +dirty by QEMU. > > By default, dirty pages are tracked during pre-copy as well as stop-and-copy > -phase. So, a page pinned by the vendor driver will be copied to the destination > -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if > -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps > -finding dirty pages continuously, then it understands that even in stop-and-copy > -phase, it is likely to find dirty pages and can predict the downtime > -accordingly. > +phase. So, a page marked as dirty will be copied to the destination in both > +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can > +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding > +dirty pages continuously, then it understands that even in stop-and-copy phase, > +it is likely to find dirty pages and can predict the downtime accordingly. > > QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking`` > which disables querying the dirty bitmap during pre-copy phase. If it is set to > @@ -97,10 +112,9 @@ System memory dirty pages tracking when vIOMMU is enabled > --------------------------------------------------------- > > With vIOMMU, an IO virtual address range can get unmapped while in pre-copy > -phase of migration. In that case, the unmap ioctl returns any dirty pages in > -that range and QEMU reports corresponding guest physical pages dirty. During > -stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped > -pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those > +phase of migration. In that case, dirty page bitmap for this range is queried > +and synced with QEMU. During stop-and-copy phase, an IOMMU notifier is used to > +get a callback for mapped pages and then dirty page bitmap is fetched for those > mapped ranges. > > Flow of state changes during Live migration
diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst index ba80b9150d..a432cda081 100644 --- a/docs/devel/vfio-migration.rst +++ b/docs/devel/vfio-migration.rst @@ -71,22 +71,37 @@ System memory dirty pages tracking ---------------------------------- A ``log_global_start`` and ``log_global_stop`` memory listener callback informs -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync`` -memory listener callback marks those system memory pages as dirty which are -used for DMA by the VFIO device. The dirty pages bitmap is queried per -container. All pages pinned by the vendor driver through external APIs have to -be marked as dirty during migration. When there are CPU writes, CPU dirty page -tracking can identify dirtied pages, but any page pinned by the vendor driver -can also be written by the device. There is currently no device or IOMMU -support for dirty page tracking in hardware. +the VFIO dirty tracking module to start and stop dirty page tracking. A +``log_sync`` memory listener callback queries the dirty page bitmap from the +dirty tracking module and marks system memory pages which were DMA-ed by the +VFIO device as dirty. The dirty page bitmap is queried per container. + +Currently there are two ways dirty page tracking can be done: +(1) Device dirty tracking: +In this method the device is responsible to log and report its DMAs. This +method can be used only if the device is capable of tracking its DMAs. +Discovering device capability, starting and stopping dirty tracking, and +syncing the dirty bitmaps from the device are done using the DMA logging uAPI. +More info about the uAPI can be found in the comments of the +``vfio_device_feature_dma_logging_control`` and +``vfio_device_feature_dma_logging_report`` structures in the header file +linux-headers/linux/vfio.h. + +(2) VFIO IOMMU module: +In this method dirty tracking is done by IOMMU. However, there is currently no +IOMMU support for dirty page tracking. For this reason, all pages are +perpetually marked dirty, unless the device driver pins pages through external +APIs in which case only those pinned pages are perpetually marked dirty. + +If the above two methods are not supported, all pages are perpetually marked +dirty by QEMU. By default, dirty pages are tracked during pre-copy as well as stop-and-copy -phase. So, a page pinned by the vendor driver will be copied to the destination -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps -finding dirty pages continuously, then it understands that even in stop-and-copy -phase, it is likely to find dirty pages and can predict the downtime -accordingly. +phase. So, a page marked as dirty will be copied to the destination in both +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding +dirty pages continuously, then it understands that even in stop-and-copy phase, +it is likely to find dirty pages and can predict the downtime accordingly. QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking`` which disables querying the dirty bitmap during pre-copy phase. If it is set to @@ -97,10 +112,9 @@ System memory dirty pages tracking when vIOMMU is enabled --------------------------------------------------------- With vIOMMU, an IO virtual address range can get unmapped while in pre-copy -phase of migration. In that case, the unmap ioctl returns any dirty pages in -that range and QEMU reports corresponding guest physical pages dirty. During -stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped -pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those +phase of migration. In that case, dirty page bitmap for this range is queried +and synced with QEMU. During stop-and-copy phase, an IOMMU notifier is used to +get a callback for mapped pages and then dirty page bitmap is fetched for those mapped ranges. Flow of state changes during Live migration
Adjust the VFIO dirty page tracking documentation and add a section to describe device dirty page tracking. Signed-off-by: Avihai Horon <avihaih@nvidia.com> --- docs/devel/vfio-migration.rst | 50 ++++++++++++++++++++++------------- 1 file changed, 32 insertions(+), 18 deletions(-)