Message ID | 20221207133646.635760-1-eric.auger@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | virtio-iommu: Fix Replay | expand |
Hi, Eric, On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: > When assigning VFIO devices protected by a virtio-iommu we need to replay > the mappings when adding a new IOMMU MR and when attaching a device to > a domain. While we do a "remap" we currently fail to first unmap the > existing IOVA mapping and just map the new one. With some device/group > topology this can lead to errors in VFIO when trying to DMA_MAP IOVA > ranges onto existing ones. I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when we were working on the vt-d replay for vfio. The issue is whether DMA can happen right after UNMAP but before MAP of the same page if the page was always mapped. The vt-d resolved it by using iova_tree so in a replay vt-d knows the page didn't change, so it avoids unmap+map. It only notifies newly unmapped or newly mapped. Thanks,
Hi Peter, On 12/8/22 00:49, Peter Xu wrote: > Hi, Eric, > > On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: >> When assigning VFIO devices protected by a virtio-iommu we need to replay >> the mappings when adding a new IOMMU MR and when attaching a device to >> a domain. While we do a "remap" we currently fail to first unmap the >> existing IOVA mapping and just map the new one. With some device/group >> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA >> ranges onto existing ones. > I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when > we were working on the vt-d replay for vfio. The issue is whether DMA can > happen right after UNMAP but before MAP of the same page if the page was > always mapped. I don't think it can race because a mutex is hold while doing the virtio_iommu_replay(), and each time a virtio cmd is handled (attach, map, unmap), see virtio_iommu_handle_command. So I think it is safe. Thanks Eric > > The vt-d resolved it by using iova_tree so in a replay vt-d knows the page > didn't change, so it avoids unmap+map. It only notifies newly unmapped or > newly mapped. > > Thanks, >
On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote: > Hi Peter, Hi, Eric, > > On 12/8/22 00:49, Peter Xu wrote: > > Hi, Eric, > > > > On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: > >> When assigning VFIO devices protected by a virtio-iommu we need to replay > >> the mappings when adding a new IOMMU MR and when attaching a device to > >> a domain. While we do a "remap" we currently fail to first unmap the > >> existing IOVA mapping and just map the new one. With some device/group > >> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA > >> ranges onto existing ones. > > I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when > > we were working on the vt-d replay for vfio. The issue is whether DMA can > > happen right after UNMAP but before MAP of the same page if the page was > > always mapped. > > I don't think it can race because a mutex is hold while doing the > virtio_iommu_replay(), and each time a virtio cmd is handled (attach, > map, unmap), see virtio_iommu_handle_command. > So I think it is safe. It's not the race in the code, it's the race between modifying host IOMMU pgtable with DMA happening in parallel. The bug triggered with DMA_MAP returning -EEXIST means there's existing mapping. If during replay there's mapped ranges and the ranges are prone to DMA, then IIUC it can happen. I didn't really check specifically for virtio-iommu and I mostly forget the details, just to raise this up. It's possible for some reason it just can't trigger. VT-d definitely can, in which case we'll see DMA errors on the host from the assigned device when the DMA triggers during the "unmap and map" window. Thanks,
On Thu, Dec 08, 2022 at 09:58:06AM -0500, Peter Xu wrote: > On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote: > > Hi Peter, > > Hi, Eric, > > > > > On 12/8/22 00:49, Peter Xu wrote: > > > Hi, Eric, > > > > > > On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: > > >> When assigning VFIO devices protected by a virtio-iommu we need to replay > > >> the mappings when adding a new IOMMU MR and when attaching a device to > > >> a domain. While we do a "remap" we currently fail to first unmap the > > >> existing IOVA mapping and just map the new one. With some device/group > > >> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA > > >> ranges onto existing ones. > > > I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when > > > we were working on the vt-d replay for vfio. The issue is whether DMA can > > > happen right after UNMAP but before MAP of the same page if the page was > > > always mapped. > > > > I don't think it can race because a mutex is hold while doing the > > virtio_iommu_replay(), and each time a virtio cmd is handled (attach, > > map, unmap), see virtio_iommu_handle_command. > > So I think it is safe. > > It's not the race in the code, it's the race between modifying host IOMMU > pgtable with DMA happening in parallel. The bug triggered with DMA_MAP > returning -EEXIST means there's existing mapping. > > If during replay there's mapped ranges and the ranges are prone to DMA, > then IIUC it can happen. > > I didn't really check specifically for virtio-iommu and I mostly forget the > details, just to raise this up. It's possible for some reason it just > can't trigger. VT-d definitely can, in which case we'll see DMA errors on > the host from the assigned device when the DMA triggers during the "unmap > and map" window. > > Thanks, Eric any resolution on this? > -- > Peter Xu
Hi Michael, Peter, On 12/20/22 15:59, Michael S. Tsirkin wrote: > On Thu, Dec 08, 2022 at 09:58:06AM -0500, Peter Xu wrote: >> On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote: >>> Hi Peter, >> Hi, Eric, >> >>> On 12/8/22 00:49, Peter Xu wrote: >>>> Hi, Eric, >>>> >>>> On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: >>>>> When assigning VFIO devices protected by a virtio-iommu we need to replay >>>>> the mappings when adding a new IOMMU MR and when attaching a device to >>>>> a domain. While we do a "remap" we currently fail to first unmap the >>>>> existing IOVA mapping and just map the new one. With some device/group >>>>> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA >>>>> ranges onto existing ones. >>>> I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when >>>> we were working on the vt-d replay for vfio. The issue is whether DMA can >>>> happen right after UNMAP but before MAP of the same page if the page was >>>> always mapped. >>> I don't think it can race because a mutex is hold while doing the >>> virtio_iommu_replay(), and each time a virtio cmd is handled (attach, >>> map, unmap), see virtio_iommu_handle_command. >>> So I think it is safe. >> It's not the race in the code, it's the race between modifying host IOMMU >> pgtable with DMA happening in parallel. The bug triggered with DMA_MAP >> returning -EEXIST means there's existing mapping. >> >> If during replay there's mapped ranges and the ranges are prone to DMA, >> then IIUC it can happen. >> >> I didn't really check specifically for virtio-iommu and I mostly forget the >> details, just to raise this up. It's possible for some reason it just >> can't trigger. VT-d definitely can, in which case we'll see DMA errors on >> the host from the assigned device when the DMA triggers during the "unmap >> and map" window. >> >> Thanks, > Eric any resolution on this? Sorry for the delay. Not yet unfortunately. following Peter's reply I now understand the race issue and it makes total sense but I need to study it further. Eric > >> -- >> Peter Xu