Message ID | 20240805062727.2307552-17-zhenzhong.duan@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | intel_iommu: Enable stage-1 translation for emulated device | expand |
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com> On 05/08/2024 08:27, Zhenzhong Duan wrote: > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > > > When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU > can't translate stage-1 page table from guest correctly. > > Add a property x-cap-fs1gp for user to turn FS1GP off so that > nested page table on host side works. > > This property has no effect when vIOMMU isn't in scalable modern > mode. > > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> > --- > include/hw/i386/intel_iommu.h | 1 + > hw/i386/intel_iommu.c | 5 ++++- > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h > index 650641544c..f6d9b41b80 100644 > --- a/include/hw/i386/intel_iommu.h > +++ b/include/hw/i386/intel_iommu.h > @@ -308,6 +308,7 @@ struct IntelIOMMUState { > bool dma_drain; /* Whether DMA r/w draining enabled */ > bool dma_translation; /* Whether DMA translation supported */ > bool pasid; /* Whether to support PASID */ > + bool fs1gp; /* First Stage 1-GByte Page Support */ > > /* > * Protects IOMMU states in general. Currently it protects the > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index 9e973bd710..d7e7354db4 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -3778,6 +3778,7 @@ static Property vtd_properties[] = { > DEFINE_PROP_BOOL("x-pasid-mode", IntelIOMMUState, pasid, false), > DEFINE_PROP_BOOL("dma-drain", IntelIOMMUState, dma_drain, true), > DEFINE_PROP_BOOL("dma-translation", IntelIOMMUState, dma_translation, true), > + DEFINE_PROP_BOOL("x-cap-fs1gp", IntelIOMMUState, fs1gp, true), > DEFINE_PROP_END_OF_LIST(), > }; > > @@ -4506,7 +4507,9 @@ static void vtd_cap_init(IntelIOMMUState *s) > /* TODO: read cap/ecap from host to decide which cap to be exposed. */ > if (s->scalable_modern) { > s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_FLTS; > - s->cap |= VTD_CAP_FS1GP; > + if (s->fs1gp) { > + s->cap |= VTD_CAP_FS1GP; > + } > } else if (s->scalable_mode) { > s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_SRS | VTD_ECAP_SLTS; > } > -- > 2.34.1 >
On 2024/8/5 14:27, Zhenzhong Duan wrote: > When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU > can't translate stage-1 page table from guest correctly. this series is for emulated devices, so the above statement does not belong to this series. Is there any other reason to have this option? > Add a property x-cap-fs1gp for user to turn FS1GP off so that > nested page table on host side works. I guess you would need to sync the FS1GP cap with host before reporting it in vIOMMU when comes to support passthrough devices. > This property has no effect when vIOMMU isn't in scalable modern > mode. > > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> > --- > include/hw/i386/intel_iommu.h | 1 + > hw/i386/intel_iommu.c | 5 ++++- > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h > index 650641544c..f6d9b41b80 100644 > --- a/include/hw/i386/intel_iommu.h > +++ b/include/hw/i386/intel_iommu.h > @@ -308,6 +308,7 @@ struct IntelIOMMUState { > bool dma_drain; /* Whether DMA r/w draining enabled */ > bool dma_translation; /* Whether DMA translation supported */ > bool pasid; /* Whether to support PASID */ > + bool fs1gp; /* First Stage 1-GByte Page Support */ > > /* > * Protects IOMMU states in general. Currently it protects the > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index 9e973bd710..d7e7354db4 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -3778,6 +3778,7 @@ static Property vtd_properties[] = { > DEFINE_PROP_BOOL("x-pasid-mode", IntelIOMMUState, pasid, false), > DEFINE_PROP_BOOL("dma-drain", IntelIOMMUState, dma_drain, true), > DEFINE_PROP_BOOL("dma-translation", IntelIOMMUState, dma_translation, true), > + DEFINE_PROP_BOOL("x-cap-fs1gp", IntelIOMMUState, fs1gp, true), > DEFINE_PROP_END_OF_LIST(), > }; > > @@ -4506,7 +4507,9 @@ static void vtd_cap_init(IntelIOMMUState *s) > /* TODO: read cap/ecap from host to decide which cap to be exposed. */ > if (s->scalable_modern) { > s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_FLTS; > - s->cap |= VTD_CAP_FS1GP; > + if (s->fs1gp) { > + s->cap |= VTD_CAP_FS1GP; > + } > } else if (s->scalable_mode) { > s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_SRS | VTD_ECAP_SLTS; > }
>-----Original Message----- >From: Liu, Yi L <yi.l.liu@intel.com> >Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control >FS1GP cap bit setting > >On 2024/8/5 14:27, Zhenzhong Duan wrote: >> When host IOMMU doesn't support FS1GP but vIOMMU does, host >IOMMU >> can't translate stage-1 page table from guest correctly. > >this series is for emulated devices, so the above statement does not >belong to this series. Is there any other reason to have this option? Good catch, will remove this comment. In fact, this patch is mainly for passthrough device where host IOMMU doesn't support fs1gp. > >> Add a property x-cap-fs1gp for user to turn FS1GP off so that >> nested page table on host side works. > >I guess you would need to sync the FS1GP cap with host before reporting it >in vIOMMU when comes to support passthrough devices. Yes, we already have this check, see https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743ee6f73e80fe67b3a Thanks Zhenzhong
On 2024/8/15 11:46, Duan, Zhenzhong wrote: > > >> -----Original Message----- >> From: Liu, Yi L <yi.l.liu@intel.com> >> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control >> FS1GP cap bit setting >> >> On 2024/8/5 14:27, Zhenzhong Duan wrote: >>> When host IOMMU doesn't support FS1GP but vIOMMU does, host >> IOMMU >>> can't translate stage-1 page table from guest correctly. >> >> this series is for emulated devices, so the above statement does not >> belong to this series. Is there any other reason to have this option? > > Good catch, will remove this comment. > In fact, this patch is mainly for passthrough device where host IOMMU doesn't support fs1gp. I see. To me, as long as the vIOMMU page walk logic supports 1GP large pages, it's ok to report the FS1GP cap to VM. But it is still fine to have this property to opt-out FS1GP if admin/orchestration layer(e.g. libvirt) knows no hw iommu has this capability, so it is better to opt out it before invoking QEMU. Is this your motivation for this property? >> >>> Add a property x-cap-fs1gp for user to turn FS1GP off so that >>> nested page table on host side works. >> >> I guess you would need to sync the FS1GP cap with host before reporting it >> in vIOMMU when comes to support passthrough devices. > > Yes, we already have this check, see https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743ee6f73e80fe67b3a good to know it. :) Will you fail the VM if the device's iommu does not support FS1GP or just mask out the FS1GP?
>-----Original Message----- >From: Liu, Yi L <yi.l.liu@intel.com> >Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control >FS1GP cap bit setting > >On 2024/8/15 11:46, Duan, Zhenzhong wrote: >> >> >>> -----Original Message----- >>> From: Liu, Yi L <yi.l.liu@intel.com> >>> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to >control >>> FS1GP cap bit setting >>> >>> On 2024/8/5 14:27, Zhenzhong Duan wrote: >>>> When host IOMMU doesn't support FS1GP but vIOMMU does, host >>> IOMMU >>>> can't translate stage-1 page table from guest correctly. >>> >>> this series is for emulated devices, so the above statement does not >>> belong to this series. Is there any other reason to have this option? >> >> Good catch, will remove this comment. >> In fact, this patch is mainly for passthrough device where host IOMMU >doesn't support fs1gp. > >I see. To me, as long as the vIOMMU page walk logic supports 1GP large >pages, it's ok to report the FS1GP cap to VM. But it is still fine to >have this property to opt-out FS1GP if admin/orchestration layer(e.g. libvirt) >knows no hw iommu has this capability, so it is better to opt out it >before invoking QEMU. > >Is this your motivation for this property? Exactly. > >>> >>>> Add a property x-cap-fs1gp for user to turn FS1GP off so that >>>> nested page table on host side works. >>> >>> I guess you would need to sync the FS1GP cap with host before reporting >it >>> in vIOMMU when comes to support passthrough devices. >> >> Yes, we already have this check, see >https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743 >ee6f73e80fe67b3a > >good to know it. :) Will you fail the VM if the device's iommu does not >support FS1GP or just mask out the FS1GP? For cold plugged VFIO device, it will fail the VM with "Stage-1 1GB huge page is unsupported by host IOMMU" error report. For hotplug VFIO device, only hotplug fails with "Stage-1 1GB huge page is unsupported by host IOMMU". We don't update vIOMMU cap/ecap from host cap/ecap per Michael's suggestion, only vIOMMU properties can control them. Thanks Zhenzhong
On 2024/8/19 17:41, Duan, Zhenzhong wrote: > > >> -----Original Message----- >> From: Liu, Yi L <yi.l.liu@intel.com> >> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control >> FS1GP cap bit setting >> >> On 2024/8/15 11:46, Duan, Zhenzhong wrote: >>> >>> >>>> -----Original Message----- >>>> From: Liu, Yi L <yi.l.liu@intel.com> >>>> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to >> control >>>> FS1GP cap bit setting >>>> >>>> On 2024/8/5 14:27, Zhenzhong Duan wrote: >>>>> When host IOMMU doesn't support FS1GP but vIOMMU does, host >>>> IOMMU >>>>> can't translate stage-1 page table from guest correctly. >>>> >>>> this series is for emulated devices, so the above statement does not >>>> belong to this series. Is there any other reason to have this option? >>> >>> Good catch, will remove this comment. >>> In fact, this patch is mainly for passthrough device where host IOMMU >> doesn't support fs1gp. >> >> I see. To me, as long as the vIOMMU page walk logic supports 1GP large >> pages, it's ok to report the FS1GP cap to VM. But it is still fine to >> have this property to opt-out FS1GP if admin/orchestration layer(e.g. libvirt) >> knows no hw iommu has this capability, so it is better to opt out it >> before invoking QEMU. >> >> Is this your motivation for this property? > > Exactly. ok, then let's keep it in this series after refining the commit message a bit. >> >>>> >>>>> Add a property x-cap-fs1gp for user to turn FS1GP off so that >>>>> nested page table on host side works. >>>> >>>> I guess you would need to sync the FS1GP cap with host before reporting >> it >>>> in vIOMMU when comes to support passthrough devices. >>> >>> Yes, we already have this check, see >> https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743 >> ee6f73e80fe67b3a >> >> good to know it. :) Will you fail the VM if the device's iommu does not >> support FS1GP or just mask out the FS1GP? > > For cold plugged VFIO device, it will fail the VM with "Stage-1 1GB huge page is unsupported by host IOMMU" error report. > For hotplug VFIO device, only hotplug fails with "Stage-1 1GB huge page is unsupported by host IOMMU". > > We don't update vIOMMU cap/ecap from host cap/ecap per Michael's suggestion, only vIOMMU properties can control them. I see. yeah, it makes sense.
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 650641544c..f6d9b41b80 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -308,6 +308,7 @@ struct IntelIOMMUState { bool dma_drain; /* Whether DMA r/w draining enabled */ bool dma_translation; /* Whether DMA translation supported */ bool pasid; /* Whether to support PASID */ + bool fs1gp; /* First Stage 1-GByte Page Support */ /* * Protects IOMMU states in general. Currently it protects the diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 9e973bd710..d7e7354db4 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -3778,6 +3778,7 @@ static Property vtd_properties[] = { DEFINE_PROP_BOOL("x-pasid-mode", IntelIOMMUState, pasid, false), DEFINE_PROP_BOOL("dma-drain", IntelIOMMUState, dma_drain, true), DEFINE_PROP_BOOL("dma-translation", IntelIOMMUState, dma_translation, true), + DEFINE_PROP_BOOL("x-cap-fs1gp", IntelIOMMUState, fs1gp, true), DEFINE_PROP_END_OF_LIST(), }; @@ -4506,7 +4507,9 @@ static void vtd_cap_init(IntelIOMMUState *s) /* TODO: read cap/ecap from host to decide which cap to be exposed. */ if (s->scalable_modern) { s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_FLTS; - s->cap |= VTD_CAP_FS1GP; + if (s->fs1gp) { + s->cap |= VTD_CAP_FS1GP; + } } else if (s->scalable_mode) { s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_SRS | VTD_ECAP_SLTS; }
When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU can't translate stage-1 page table from guest correctly. Add a property x-cap-fs1gp for user to turn FS1GP off so that nested page table on host side works. This property has no effect when vIOMMU isn't in scalable modern mode. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> --- include/hw/i386/intel_iommu.h | 1 + hw/i386/intel_iommu.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-)